PREFACE


This volume is part of a five-volume set that summarizes the research of participants in the 1996 AFOSR
Summer Research Extension Program (SREP). The current volume, Volume 1 of 5, presents the final
reports of SREP participants at Armstrong Laboratory. Volume 1 also includes the Management Report.

Reports presented in this volume are arranged alphabetically by author and are numbered consecutively,
e.g., 1-1, 1-2, 1-3; 2-1, 2-2, 2-3, with each series of reports preceded by a 35-page management summary.
Reports in the five-volume set are organized as follows:


VOLUME    TITLE

1         Armstrong Laboratory
2         Phillips Laboratory
3         Rome Laboratory
4A        Wright Laboratory
4B        Wright Laboratory
5         Arnold Engineering Development Center
          Air Logistics Centers

Hybrid Evolutionary Learning System

Mateen  M.  Rizki 
Associate  Professor 

Department  of  Computer  Science  and  Engineering 
Wright  State  University 

Abstract 

E-MORPH  is  a  multi-phase  evolutionary  learning  system  that  evolves  cooperative  sets  of  feature  detectors  and 
combines  their  response  using  a  simple  nearest  neighbor  classifier  to  form  a  complete  pattern  recognition  system. 
The  learning  system  evolves  registered  sets  of  primitive  morphological  detectors  that  directly  measure  normalized 
radar  signatures.  Special  convolution  kernels  are  evolved  to  extract  information  from  the  output  of  the  primitive 
transforms to form real-valued feature vectors. Starting with a population of trivial, randomly generated transforms,
EMORPH uses a novel combination of three evolutionary learning techniques, namely genetic programming (GP),
evolutionary programming (EP), and genetic algorithms (GA), to evolve complete pattern recognition systems. The
GP grows complex mathematical expressions that perform signal-to-signal transformations, EP optimizes
convolution templates to process the results of these transformations, and the GA combines sets of feature detectors
to form orthogonal features. A simple nearest neighbor classifier is used to classify the resulting features, forming a
complete pattern recognition system. This report provides a brief description of E-MORPH and presents recognition
results for the problem of classifying high range resolution radar signatures. This problem is challenging because the
data sets exhibit large within-class variation and poor separation between classes. The specific data set used in this
experiment  consists  of  60  signatures  of  six  airborne  targets  drawn  from  a  1°  x  10°  (azimuth  x  elevation)  view 
window.  The  best  recognition  system  evolved  using  EMORPH  accurately  classified  100%  of  the  training  signatures 
(6  targets  x  5  samples  =  30  signatures)  and  90.0%  of  the  signatures  in  an  independent  test  set  (6  targets  x  5  samples  = 
30  signatures).  This  result  is  based  on  a  preliminary  experiment  that  did  not  involve  tuning  EMORPH’s  control 
parameters  for  this  specific  problem.  This  suggests  that  even  better  performance  can  be  achieved  in  future 
experiments.  The  techniques  used  in  E-MORPH  are  not  tied  to  radar  signals.  The  approach  is  generic  and  readily 
transitions  to  many  different  problems  in  automatic  target  recognition. 

INTRODUCTION 

The  foundation  of  a  robust  pattern  recognition  system  is  the  set  of  features  used  to  distinguish  among  the  given 
patterns.  In  many  problems,  the  features  are  predetermined  and  the  task  is  to  build  a  system  to  extract  the  selected 
features  and  then  classify  the  resultant  measurements.  In  automatic  target  recognition  problems,  the  identification  of 
a  set  of  robust,  invariant  features  is  complicated  because  the  shape  and  orientation  of  the  objects  of  interest  are  often 
not known a priori. As a result, a human expert is responsible for examining each problem to formulate an effective
set of features and then build a system to perform the recognition task. An alternative to this labor-intensive approach
to building recognition systems has emerged in the past ten years that uses learning algorithms such as neural
networks and genetic algorithms to automate the process of feature extraction. There are many advantages to the
automated  construction  of  recognition  systems  over  techniques  that  rely  solely  on  human  expertise.  Automated 
approaches  are  not  problem  specific.  Consequently,  once  an  automated  system  is  developed,  it  can  be  readily  applied 
to  similar  problems  greatly  reducing  the  time  needed  to  solve  new  recognition  problems.  Automated  systems  are 
capable  of  producing  solutions  that  are  comparable  to  the  customized  solutions  created  by  human  experts,  but  the 
solutions  formed  by  these  systems  are  often  non-intuitive  and  quite  different  from  the  solutions  formed  by  human 
experts.  In  many  applications,  this  is  a  drawback  because  it  is  not  possible  to  describe  how  the  solution  is  obtained. 
This  is  also  a  strength  of  the  automated  approach.  Automated  techniques  are  unbiased.  The  features  selected  to  solve 
problems  represent  alternative  designs  based  on  the  structural  and  statistical  attributes  of  the  data.  The  fact  that 
different  features  are  selected  suggests  that  automated  systems  are  capable  of  exploring  different  regions  of  the  space 
of  potential  solutions. 

Several  automated  target  recognition  systems  exist  that  use  evolutionary  learning  to  extract  features  from  raw  data 
and  perform  classification  [Rizki  et  al.  1993,  1994].  Early  experiments  with  EMORPH,  a  system  developed  to 
evolve  morphological  algorithms,  demonstrated  that  hybrid  evolutionary  learning  systems  are  capable  of  generating 
pattern  recognition  systems  to  automatically  perform  feature  extraction  and  classification  from  grey-scale  images.  In 
this  system,  a  robust  set  of  features  is  identified  using  a  population  of  pattern  recognition  systems.  Each  system  is 
composed  of  a  collection  of  cooperative  feature  detectors  and  a  classifier  that  evolves  under  the  control  of  a  user 
provided  performance  measure.  The  performance  measure  is  tied  to  recognition  accuracy,  but  additional  constraints 
are  included  such  as  complexity  measures  to  sculpt  specific  types  of  solutions.  The  recognition  systems  compete  for 
survival  based  on  their  performance.  Successful  systems  have  a  higher  probability  of  survival  and  contribute  more 
information to future generations. The structural and statistical information gathered by each recognition system
during  the  evolutionary  process  is  passed  to  the  next  generation  through  a  process  of  reproduction  with  variation. 
The  most  successful  recognition  systems  are  combined  to  form  new  recognition  systems  that  are  often  superior  to 
either  parental  unit.  Two  opposing  forces  operate  in  the  evolutionary  process:  exploration  and  exploitation.  By 
recombining  successful  solutions  during  reproduction,  each  generation  contains  recognition  systems  that  are  more 
capable of exploiting the performance measure and solving the recognition task. The reproductive process is
imperfect: variations in the new recognition systems are created by mutating the structure of the feature detectors.
Each  new  recognition  system  contains  pieces  of  past  successful  designs  with  variations  that  explore  alternative 
designs.  The  process  of  reproduction  with  variation  and  selection  continues  until  the  best  recognition  system  in  the 
population  achieves  a  satisfactory  level  of  performance. 

This  report  describes  experiments  conducted  using  a  modified  version  of  EMORPH  to  evolve  pattern  recognition 
systems to classify high range resolution radar cross sections. This version of EMORPH blends three evolutionary
learning paradigms: evolutionary programming [Fogel et al., 1966; Fogel, 1991], genetic algorithms [Holland, 1975;
Goldberg, 1989], and genetic programming [Koza, 1992] to form a hybrid learning system. The system extracts
features from a training set of radar cross sections, assembles cooperative sets of features, and forms a nearest
neighbor classifier to label targets. A minimum amount of effort was devoted to tuning EMORPH for the specific
problem of classifying radar targets, yet the evolved pattern recognition systems accurately classify an independent
test set of radar cross sections.

THE EVOLUTIONARY LEARNING SYSTEM


The  overall  design  of  the  EMORPH  generated  recognition  system  is  shown  in  Figure  1.  A  recognition  system  is 
composed  of  a  feature  extraction  module  and  a  classification  module.  The  feature  extraction  module  applies  a  set  of 
feature  detectors  to  each  radar  cross  section  to  form  a  feature  vector.  The  classifier  then  assigns  a  target  label  to  each 
feature  vector.  A  feature  detector  is  composed  of  two  components,  a  transformation  and  a  cap.  Transformations  are 
networks  of  morphological  and  arithmetic  operations  that  alter  the  signal  in  an  attempt  to  enhance  the  most 
discriminating  regions  of  the  radar  cross  sections  while  suppressing  noise.  Caps  are  convolution  kernels  or  templates 
composed  of  a  collection  of  positive  and  negative  Gaussian  probes  that  are  used  to  explore  both  the  geometrical 
structure  and  contrast  variation  of  the  transformed  signal.  The  convolution  operator  produces  its  strongest  response 
when all of the positive and negative probe points align with similar regions in the signal. Consequently, when a cap
produces a strong response, it indicates that the geometry and contrast variation embodied in the cap also exist in the
transformed signal. By adjusting the positions, values, and spread of the probe points, complex structural
relationships  are  readily  identified.  The  output  of  a  detector  set  is  a  registered  stack  of  processed  radar  cross  sections. 
The  set  of  caps  present  in  a  single  recognition  system  is  a  registered  set  of  convolution  templates  that  serves  as  a  3D 
probe. This probe allows the recognition system to explore relationships within a single stack-plane (transformed
view of the input signal) or across several stack-planes. By varying the parameters of each transform, the feature
extraction module can decompose a signal into different spatial frequencies creating a pseudo-wavelet
transformation. When this occurs, the 3D probe can exploit multiple resolution levels to selectively mask noisy high
frequency spikes leaving only the most prominent structures for further analysis. When the full set of detectors is
applied to a radar signal, a real valued feature vector is produced. Each detector contributes one component to the
vector. By repeating this process for all the signals, a feature matrix is created that is used to form a nearest neighbor
classifier for target classification.

Figure 1. Overview of an evolved pattern recognition system.
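
The flow from detectors to a feature vector to a nearest neighbor label can be sketched as follows. This is a minimal illustration, not the EMORPH implementation: the function names, the (center, weight, sigma) probe representation, and the Gaussian-weighted sum used for the cap response are all assumptions made for the example.

```python
import math

def cap_response(signal, probes):
    """Response of one cap to a transformed signal.

    Each probe is a (center, weight, sigma) triple (assumed representation).
    A Gaussian window around each center weights the signal samples, so
    positive and negative weights reward matching high and low regions.
    """
    total = 0.0
    for center, weight, sigma in probes:
        for x, value in enumerate(signal):
            total += weight * math.exp(-0.5 * ((x - center) / sigma) ** 2) * value
    return total

def feature_vector(signal, detectors):
    """Each detector is a (transform, probes) pair and yields one component."""
    return [cap_response(transform(signal), probes) for transform, probes in detectors]

def nearest_neighbor(feature_matrix, labels, query):
    """Return the label of the stored feature vector closest to the query."""
    best = min(range(len(feature_matrix)),
               key=lambda i: math.dist(feature_matrix[i], query))
    return labels[best]

# Usage: two toy "signatures" and two detectors that use an identity transform.
identity = lambda s: s
detectors = [(identity, [(1, 1.0, 0.5)]), (identity, [(3, 1.0, 0.5)])]
training = [[0, 9, 0, 1, 0], [0, 1, 0, 9, 0]]
labels = ["A", "B"]
matrix = [feature_vector(s, detectors) for s in training]
predicted = nearest_neighbor(matrix, labels, feature_vector([0, 8, 0, 2, 0], detectors))
```

Each row of `matrix` plays the role of one row of the feature matrix described above, and the stored rows serve directly as the prototypes of the classifier.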

The E-MORPH system forms feature detectors by creating a pool of signal transformations as shown in Figure 2.
These transformations are evaluated using local performance measures that attempt to evaluate the information
content of each transformed signal. The results of these transforms are capped using convolution templates and the
results of the capped transforms are evaluated using a second local performance measure. Finally, capped transforms
are selected to form a cooperative set of feature detectors that are evaluated by forming a nearest neighbor classifier
to evaluate recognition accuracy.

Figure 2. Overview of the EMORPH system.

After each recognition system is assigned a performance measure, it competes for
survival with other members of the population. The competition is organized as a tournament that ranks each
recognition system based on its performance relative to the performance of other systems in the population. The size
of the tournament changes throughout the evolutionary process and is based on the average performance of the
population as shown in Equation 1.


$NC = \max(\ldots)$    Equation 1


In this equation, NC is the number of competitors in each tournament, N is the population size, and M is a user-imposed
upper limit on the number of competitions (M <= N). Each recognition system must win as many conflicts
as possible to increase its chance for survival. The number of competitions won or lost is calculated using Equation 2.

Equation 2


In  these  local  competitions,  the  chance  of  winning  is  proportional  to  the  ratio  of  the  performance  measure  of  the 
recognition system and its competitor. For example, if a recognition system's performance ($pm_i$) is high and a
randomly selected competitor's performance ($pm_{2 \cdot N \cdot U(0,1)}$) is low, then the probability that the ratio is greater than a
value drawn from a uniformly distributed random variable U(0, 1) is also high. When the relationship shown in
Equation  2  is  satisfied,  the  recognition  system  wins  the  pairwise  competition.  Limiting  the  tournaments  to  a  subset  of 
the  population  reduces  the  possibility  of  premature  convergence  of  the  evolutionary  process.  When  the  average 
performance  of  the  population  is  poor,  the  number  of  individuals  in  each  tournament  is  small  and  a  marginally  better 
recognition system does not have the opportunity to dominate the population. The pairwise competition used within
each  tournament  tends  to  maintain  a  diverse  population  of  recognition  systems  because  marginal  individuals  always 
have  a  small  probability  of  survival.  The  final  selection  for  survival  is  based  on  a  ranking  of  the  number  of  conflicts 
won  by  each  recognition  system.  The  sets  with  the  greatest  number  of  victories  survive  to  the  next  learning  cycle. 
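
A minimal sketch of this stochastic tournament is given below. The exact forms of Equations 1 and 2 are not reproduced here, so the sketch rests on two labeled assumptions: the tournament size is taken as the user limit M scaled by the mean population performance with a floor of one competitor, and a system wins a pairwise contest when pm_i / (pm_i + pm_c) exceeds a U(0,1) draw. The function names are hypothetical.

```python
import random

def tournament_size(performances, max_competitions):
    """Assumed stand-in for Equation 1: M scaled by the mean performance."""
    mean_pm = sum(performances) / len(performances)
    return max(1, int(max_competitions * mean_pm))

def count_wins(performances, index, n_competitors, rng):
    """Assumed stand-in for Equation 2: count stochastic pairwise wins."""
    pm_i = performances[index]
    wins = 0
    for _ in range(n_competitors):
        # Competitor drawn uniformly at random from the population.
        pm_c = performances[rng.randrange(len(performances))]
        total = pm_i + pm_c
        ratio = pm_i / total if total > 0 else 0.5
        if ratio > rng.random():
            wins += 1
    return wins

def select_survivors(performances, n_survivors, max_competitions, seed=0):
    """Rank systems by number of wins and keep the top n_survivors indices."""
    rng = random.Random(seed)
    nc = tournament_size(performances, max_competitions)
    wins = [count_wins(performances, i, nc, rng) for i in range(len(performances))]
    ranked = sorted(range(len(performances)), key=lambda i: wins[i], reverse=True)
    return ranked[:n_survivors]
```

Because every contest is decided by a random draw, a weak system retains a small chance of accumulating wins, which is the diversity-preserving property described above.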

E-MORPH  uses  three  different  techniques  to  alter  the  structure  of  the  detector  set  contained  in  each  recognition 
system. The positions of the Gaussian points in the convolution templates within a detector set are varied using
evolutionary programming [Fogel, 1991], the functional form of the transformation is modified using genetic
programming [Koza, 1992], and the collection of detectors that forms the basis of the feature extraction module is
selected using a genetic algorithm [Holland, 1975]. These techniques are combined to exploit the strengths of each
paradigm. 

EMORPH  uses  genetic  programming  (GP)  to  grow  signal  transformations.  Transformations  are  networks  of 
morphological,  arithmetic,  and  special  operators  that  are  represented  as  expression  trees.  Each  expression  performs  a 
mathematical  transformation  of  the  input  signal.  The  performance  of  a  transformation  is  evaluated  using  a  local 
performance  measure  that  consists  of  a  weighted  sum  of  the  total  energy  of  the  transformed  signal,  the  number  of 
peaks  in  the  signal,  the  magnitude  of  the  strongest  peaks,  and  the  distance  between  the  strongest  peaks.  In  addition, 
the  performance  is  adjusted  so  that  transforms  producing  similar  effects  receive  a  diminished  score.  The  GP 
algorithm operates on a population of transformations. Parental units are selected from the base population using
roulette  wheel  sampling  where  the  probability  of  selection  is  proportional  to  the  transform’s  performance  measure. 
The  transformations  are  represented  as  expression  trees.  The  input  patterns  flow  from  the  leaves  of  the  tree  through 
the  operators  to  the  root  of  the  tree.  The  GP  algorithm  exchanges  sub-trees  between  pairs  of  transformations.  In 
Figure 3, transform one (dark grey) contains a root and two sub-trees labeled S1 and S2 while transform two (stippled
grey) consists of a root and two different sub-trees labeled T1 and T2. Recombination forms two new
transformations where the sub-trees S1 and T1 are exchanged in the offspring. In addition to exchanging information
by  recombination,  sub-trees  can  be  added,  deleted,  or  replaced.  Mixing  the  structure  of  expressions  produces  radical 
changes  in  the  operation  of  the  offspring  transform.  This  disruptive  process  facilitates  the  search  for  new  functions. 
The  probability  of  each  type  of  action  is  defined  by  the  user.  Usually,  the  probability  of  mutation  (addition,  deletion, 
replacement)  is  lower  than  the  probability  of  recombination  because  recombination  preserves  larger  pieces  of  the 
structure  and  therefore  is  slightly  less  disruptive  than  mutation. 
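
The sub-tree exchange can be illustrated with a short sketch. The nested-list tree encoding ([operator, child, ...] with the string "x" as the input-signal leaf) and all function names are assumptions chosen for the example, and both parents are assumed to have at least one sub-tree below the root.

```python
import copy
import random

def subtree_paths(tree, path=()):
    """Yield the path (tuple of child indices) to every sub-tree below the root."""
    if isinstance(tree, list):
        for i in range(1, len(tree)):
            yield path + (i,)
            yield from subtree_paths(tree[i], path + (i,))

def get_subtree(tree, path):
    """Follow a path of child indices down from the root."""
    for i in path:
        tree = tree[i]
    return tree

def set_subtree(tree, path, new_subtree):
    """Replace the sub-tree found at the given path."""
    for i in path[:-1]:
        tree = tree[i]
    tree[path[-1]] = new_subtree

def crossover(parent_a, parent_b, rng):
    """Exchange one randomly chosen sub-tree between copies of the parents."""
    child_a, child_b = copy.deepcopy(parent_a), copy.deepcopy(parent_b)
    path_a = rng.choice(list(subtree_paths(child_a)))
    path_b = rng.choice(list(subtree_paths(child_b)))
    sub_a, sub_b = get_subtree(child_a, path_a), get_subtree(child_b, path_b)
    set_subtree(child_a, path_a, sub_b)
    set_subtree(child_b, path_b, sub_a)
    return child_a, child_b

# Usage: two small transforms built from operator names used in this report.
transform_one = ["add", ["erode", "x"], ["dilate", "x"]]
transform_two = ["min", ["open", ["close", "x"]], "x"]
child_one, child_two = crossover(transform_one, transform_two, random.Random(1))
```

The parents are deep-copied, so they remain in the base population unchanged while the offspring carry the exchanged material.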

Figure 3. The GP process applied to transformations. Parental transformations are defined as expression trees. Sub-trees are
recombined to form new transformations. During this process, some new operators are introduced (addition), some are removed
(deletion), and some sub-trees are replaced with randomly generated sub-trees.

EMORPH uses a combination of morphological and arithmetic operators as a basis for its functional transformations.
Mathematical morphology is a technique for probing the structure of signals or images using set theoretic operations
[Serra, 1982; Haralick et al., 1987]. Each morphological operation is a signal-to-signal transformation that applies a
probe-like pattern, referred to as a structuring element, to an input signal to produce an output signal. By selecting
the correct algebraic form and structuring elements, specific objects can be isolated or enhanced, but finding the
combination of operators and probes to perform a given task is difficult even for an experienced morphological
analyst. EMORPH solves this problem using GP to generate, evaluate, and select suitable morphological
transformations  to  accomplish  the  desired  classification.  To  begin  the  evolutionary  process,  the  transformations  that 
form  the  basis  for  the  population  of  recognition  systems  are  initialized  with  small  randomly  generated  expression 
trees. Each node in the trees contains an operator drawn from the set: erosion, dilation, opening, closing,
band-opening, band-closing, complement, addition, subtraction, minimum, maximum, and threshold. Most of the
operators require some type of parameter. The morphological operators (erosion, dilation, opening, closing,
band-opening, band-closing) use structuring elements that are selected at random from a standard library consisting of
three  basic  shapes  (e.g.  1-D  cross  section  of  a  cone,  a  bar,  a  ball).  A  scale  factor  is  also  included  to  alter  the  size  of 
the  structuring  element.  Some  of  the  arithmetic  operators  (minimum,  maximum,  threshold)  also  use  a  parameter  to 
control  the  behavior  of  the  operation.  These  parameters  are  selected  from  a  uniformly  distributed  random  variable 
(U(0,1)).  Detailed  examples  of  morphological  operations,  library  structuring  elements,  and  the  process  of  generating 
expressions  are  described  by  Zmuda  et  al.  [1992].  When  each  expression  is  generated,  it  is  applied  to  the  input 
signals.  If  it  produces  an  extreme  effect  (e.g.,  the  output  of  the  operation  is  a  constant  value),  it  is  considered  a  lethal 
form  and  discarded.  Transforms  are  generated  until  an  acceptable  pool  is  formed  for  the  second  stage  of  the  process. 
Two  sample  transformations  are  shown  in  Figure  4,  and  the  process  of  generating  transformations  is  summarized  in 
Figure  5. 
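
To make the operator set concrete, the sketch below implements erosion, dilation, opening, and closing for a 1-D signal together with the lethal-form test. It is a simplification under stated assumptions: only a flat bar-shaped structuring element is used (the cone and ball shapes are omitted), windows are truncated at the signal boundaries, and the function names are hypothetical.

```python
def erode(signal, half_width):
    """Flat erosion: each sample becomes the minimum over its neighborhood."""
    n = len(signal)
    return [min(signal[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]

def dilate(signal, half_width):
    """Flat dilation: each sample becomes the maximum over its neighborhood."""
    n = len(signal)
    return [max(signal[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]

def opening(signal, half_width):
    """Erosion followed by dilation; removes peaks narrower than the bar."""
    return dilate(erode(signal, half_width), half_width)

def closing(signal, half_width):
    """Dilation followed by erosion; fills valleys narrower than the bar."""
    return erode(dilate(signal, half_width), half_width)

def is_lethal(output):
    """A transform whose output is constant carries no information."""
    return max(output) == min(output)

# Usage: a narrow spike at index 2 and a wider plateau at indices 5-7.
signature = [0, 0, 5, 0, 0, 3, 3, 3, 0, 0]
opened = opening(signature, 1)       # the spike is removed, the plateau survives
flattened = erode(signature, 3)      # a bar this wide erases every structure
```

In this toy case the wide erosion yields a constant signal, so a transform built from it alone would be discarded as a lethal form.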

Figure 4. Sample transformations. The result of applying the transform to the training set is shown in each
box. The actual form of the transform is shown at the bottom of each box. The columns represent the radar
signatures for six different targets. The rows represent variations in target elevation (-20, -18, ..., -12 degrees).

Figure 5. Overview of the process used to evolve transformations.

The purpose of the EP algorithm is to systematically improve the position, type, and number of probe points in the
convolution  templates.  This  is  accomplished  using  a  controlled  vibration  of  the  position  of  the  Gaussian  points  in 
each template followed by a series of random mutations that add and/or delete points (see Figure 6). The EP phase
manipulates each recognition system independently. To begin, a member of the population of capped transformations
is cloned to form a clonal population of C capped transformations. Each member of the clonal population
then reproduces to form an extended population. The caps in the extended population are subjected to random
variations. The amount of variation is inversely proportional to the performance of the parental capped
transformation and controlled by Equation 3.

$x'_{ijk} = x_{ijk} + \ldots \cdot (1 - pm_i) \cdot \ldots$    Equation 3

The value $x_{ijk}$ is the central position of the kth Gaussian point in the jth convolution template. The remaining
terms are the size of the template, the locations $X_{\min}$ and $X_{\max}$ of the left and right sides of the template,
the complement $(1 - pm_i)$ of the performance measure of the ith detector set, and N(0,1), a normally distributed
random variable with a mean of zero and a variance of one. To update a probe point's position, the mean of the
random variable is set to the value of the initial position of the probe point and the
variance is scaled to fall into the range from zero to half the template size. Using this technique, when the
performance measure is low, the potential extent of variation in the position of a probe point is high. The potential
for variation is reduced as the performance increases. If the performance reaches one, the potential for variation is
zero and the template's point configuration is frozen. This approach to adjusting the structure of a template is similar
to the process of simulated annealing where gradual improvements in the population performance shut down the
process of random variation as a solution is formed.

Figure 6. Evolving a convolution template (panels: cap before, cap after).
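
The vibration step can be sketched as follows. The sketch is illustrative only and makes three assumptions that are not confirmed by the text: the template size is taken as x_max - x_min, the standard deviation (rather than the variance) of the perturbation is scaled by half the template size times (1 - pm), and perturbed positions are clamped to the template. The function name and parameters are hypothetical.

```python
import random

def vibrate(positions, performance, x_min, x_max, rng):
    """Perturb probe positions by an amount that shrinks as performance rises.

    performance is the parent's measure pm in [0, 1]; at pm = 1 the scale is
    zero and the point configuration is frozen.
    """
    scale = 0.5 * (x_max - x_min) * (1.0 - performance)  # assumed scaling
    moved = []
    for x in positions:
        new_x = x + scale * rng.gauss(0.0, 1.0)
        moved.append(min(x_max, max(x_min, new_x)))  # clamp: an assumption
    return moved

# Usage: a poor cap (pm = 0.2) is shaken hard, a perfect cap not at all.
rng = random.Random(0)
shaken = vibrate([3.0, 7.0], 0.2, 0.0, 10.0, rng)
frozen = vibrate([3.0, 7.0], 1.0, 0.0, 10.0, rng)
```

As with simulated annealing, the perturbation scale plays the role of a temperature, except that here it is driven by the performance measure rather than by a fixed cooling schedule.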

The  vibration  process  is  only  capable  of  adjusting  the  position  of  existing  probe  points.  The  second  step  of  the  EP 
phase  is  mutation  that  adds  and/or  deletes  probe  points  to  alter  the  complexity  of  the  templates.  Point  mutation 
occurs  immediately  after  the  template  points  are  vibrated.  The  amount  of  each  type  of  mutation  is  controlled  by  a 
user  selected  probability.  As  a  rule,  if  the  detector  set  is  initialized  with  a  limited  number  of  probe  points,  the 
probability  of  addition  should  be  larger  than  the  probability  of  deletion.  This  will  bias  the  mutation  rate  toward 
addition  and  cause  the  detectors  to  grow  in  complexity. 

In addition to the type and placement of the Gaussian points in a template, the variance (spread) of the Gaussian probes
changes during the evolutionary process. The extent of each probe point is determined using Equation 4.

$\sigma = \sigma_{\min} + (1 - pm_i) \cdot (\sigma_{\max} - \sigma_{\min})$    Equation 4

The limits on the spread of a single Gaussian point are set by the user to ($\sigma_{\min}$, $\sigma_{\max}$). The actual size of the probe point
is then adjusted relative to the performance of the ith capped transform ($pm_i$). If the resulting cap exhibits poor
performance, the points increase in size to become less sensitive to the environment. If the cap is very accurate, the
points become smaller and more sensitive to variations in the signals.
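
A small numerical illustration of this spread rule follows. It assumes Equation 4 is a linear interpolation between the user limits, driven by the complement of the performance measure; the function name and the example limits (1.0 and 5.0 samples) are hypothetical.

```python
def probe_spread(performance, sigma_min, sigma_max):
    """Spread of a Gaussian probe for a cap with performance pm in [0, 1]."""
    return sigma_min + (1.0 - performance) * (sigma_max - sigma_min)

# Usage: with limits of 1.0 and 5.0, a poor cap gets wide, tolerant probes
# and an accurate cap gets narrow, sensitive ones.
poor = probe_spread(0.0, 1.0, 5.0)      # widest probe
middling = probe_spread(0.5, 1.0, 5.0)  # halfway between the limits
accurate = probe_spread(1.0, 1.0, 5.0)  # narrowest probe
```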

The  decision  to  accept  a  mutated  cap  is  based  on  a  local  performance  measure.  A  value  for  the  Fisher’s  Discriminant 
[Fisher,  1936]  is  calculated  for  the  original  capped  detector  and  the  mutated  detector.  This  is  a  measure  of  the 
detector’s  ability  to  increase  the  separation  between  the  means  of  the  response  for  each  class  of  target  while 
simultaneously  reducing  the  variance  in  the  response  for  each  class.  If  the  mutated  detector  is  more  discriminating 
than the original detector, it replaces the parental unit. A few sample caps are shown in Figure 7, and the results of
applying these caps to the transforms are shown in Figure 8.
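
The acceptance test can be sketched with a simple multi-class Fisher ratio on the scalar responses of one detector. The particular form used here (between-class scatter divided by within-class scatter) is an assumption, since the exact expression used by EMORPH is not given in this section; the function names are hypothetical.

```python
def fisher_score(responses_by_class):
    """Ratio of between-class scatter to within-class scatter.

    responses_by_class holds one list of scalar detector responses per target.
    """
    all_values = [v for values in responses_by_class for v in values]
    grand_mean = sum(all_values) / len(all_values)
    between = 0.0
    within = 0.0
    for values in responses_by_class:
        mean = sum(values) / len(values)
        between += len(values) * (mean - grand_mean) ** 2
        within += sum((v - mean) ** 2 for v in values)
    return between / within if within > 0 else float("inf")

def accept_mutation(original_responses, mutated_responses):
    """The mutated cap replaces its parent only if it discriminates better."""
    return fisher_score(mutated_responses) > fisher_score(original_responses)

# Usage: the mutated cap separates the two targets far more cleanly.
original = [[1.0, 3.0], [3.0, 5.0]]
mutated = [[1.0, 1.1], [5.0, 5.1]]
replace_parent = accept_mutation(original, mutated)
```

A larger score means the class means are farther apart relative to the scatter within each class, which matches the two goals stated above.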

After all the members of the clonal population reproduce, the C parental capped transforms compete with the C
offspring capped transforms for survival in a tournament. The top-ranked C detectors are preserved and the
evolutionary programming cycle begins again. After a fixed number of EP cycles, the performance of the best
capped transform evolved during the EP phase is compared to the original parental capped transform. If the best
evolved capped transform is more accurate than its parent, it replaces its parent in the base population. This process
is repeated for each member of the base population. When the EP phase terminates, the base population contains
optimized sets of feature detectors consisting of convolution caps specifically tuned to the transforms produced by the
GP phase. The process of evolving capped transformations is summarized in Figure 9.

Figure 7. Sample convolution caps.

Figure 8. Response Vectors. The response vectors formed by applying the cap shown in
the top-left corner of Figure 7 to the training data set.

EP Algorithm

for each transform Ti do
    Generate N random caps for Ti
    Evaluate capped version of the transform using Fisher's Discriminant
    for 1 to maxCycles do
        for each cap Cj do
            Produce an offspring cap by vibrating the points
            Mutate offspring (add/delete points)
            Evaluate the offspring
        endfor
        Perform tournament selection to rank the full population of caps
        Reduce population size to N based on ranking
    endfor
    Save the best cap for Ti
endfor

Figure 9. EP Process used to generate convolution caps for transformations.

EMORPH  uses  a  genetic  algorithm  (GA)  to  form  detector  sets.  The  GA  is  responsible  for  recombining  detector  sets 
(see  Figure  10).  Parental  units  are  selected  from  the  base  population  using  roulette  wheel  sampling  where  the 
probability  of  selection  is  proportional  to  the  recognition  system’s  accuracy.  Once  a  pair  of  parents  is  selected,  their 
detectors  are  exchanged  using  a  uniform  crossover.  The  detector  set  is  analogous  to  a  biological  chromosome  and  the 
individual  detectors  are  similar  to  genes.  During  crossover,  each  detector  position  in  the  parental  set  contributes 
some  portion  of  each  of  its  detectors  to  a  pair  of  offspring  detector  sets.  There  is  a  0.5  probability  that  the  first  parent 
places  its  information  in  the  first  offspring  and  the  second  parent  places  its  detector  in  the  second  offspring. 
Similarly,  there  is  a  0.5  probability  that  the  first  parent  places  its  information  in  the  second  offspring  and  the  second 
parent  places  its  detector  in  the  first  offspring.  As  shown  in  Figure  10,  a  parental  unit  simply  copies  its  whole 
detector  (cap  plus  transform)  into  the  selected  offspring. 
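
Parent selection and uniform crossover of detector sets can be sketched as follows. The sketch assumes both parental sets hold the same number of detectors and treats each detector (cap plus transform) as an opaque item that is copied whole; the function names are hypothetical.

```python
import random

def roulette_select(accuracies, rng):
    """Pick an index with probability proportional to recognition accuracy."""
    pick = rng.random() * sum(accuracies)
    running = 0.0
    for index, accuracy in enumerate(accuracies):
        running += accuracy
        if pick < running:
            return index
    return len(accuracies) - 1  # fallback for rounding or all-zero accuracies

def uniform_crossover(set_a, set_b, rng):
    """At each position, send the two parental detectors to opposite offspring."""
    child_1, child_2 = [], []
    for detector_a, detector_b in zip(set_a, set_b):
        if rng.random() < 0.5:
            child_1.append(detector_a)
            child_2.append(detector_b)
        else:
            child_1.append(detector_b)
            child_2.append(detector_a)
    return child_1, child_2

# Usage: select two parents by accuracy, then recombine their detector sets.
rng = random.Random(0)
population = [["a1", "a2", "a3"], ["b1", "b2", "b3"], ["c1", "c2", "c3"]]
accuracies = [0.9, 0.6, 0.3]
first = population[roulette_select(accuracies, rng)]
second = population[roulette_select(accuracies, rng)]
offspring_1, offspring_2 = uniform_crossover(first, second, rng)
```

Because each position is decided by an independent coin flip, every detector from either parent appears in exactly one of the two offspring.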

The  GA  phase  begins  with  N  detector  sets  and  combines  N/2  pairs  of  parental  sets  to  form  an  extend  population  of  2N 
sets.  Each  member  of  the  extended  population  is  evaluated  using  the  same  procedure  described  for  the  EP  phase.  A 
tournament  selection  process  is  applied  to  rank  the  entire  population  and  the  N  top-ranked  detector  sets  are  preserved 
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Detector 


Figure  10.  Action  of  the  GA  process.  The  GA  takes  pairs  of  detector  sets  and  recombines  components  of  each 
parental  set  to  form  pairs  of  offspring.  This  process  can  result  in  combinations  of  different  caps  on  transforms  or 
whole  detectors  being  exchanged  between  sets. 


for the next cycle of the GA. When the GA phase is complete, each detector set consists of combinations
of transforms and caps that proved useful in the recognition process. A sample response matrix is shown in Figure
11. The rows represent the responses of individual detectors. The x-axis is organized by target class (1-5 is target
one, 6-10 is target two, etc.). Notice how some detectors respond consistently to subsets of targets. The features
defined by these detectors readily form prototypes for a nearest neighbor classifier. The overall flow of the GA phase
is outlined in Figure 12.
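
The recombination and selection steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the EMORPH implementation: the function names, the list-of-detectors representation, and the particular tournament scheme (each set meets a few random opponents and sets are ranked by wins) are assumptions, and the add/delete mutation step is omitted.

```python
import random


def uniform_crossover(parent_a, parent_b, rng):
    """Distribute the detectors of two parent sets over two offspring.

    At each position there is a 0.5 probability that parent_a feeds the
    first offspring and parent_b the second, and a 0.5 probability of the
    reverse. Each detector (cap plus transform) is copied whole.
    """
    child_1, child_2 = [], []
    for i in range(max(len(parent_a), len(parent_b))):
        if rng.random() < 0.5:
            dest_a, dest_b = child_1, child_2
        else:
            dest_a, dest_b = child_2, child_1
        if i < len(parent_a):
            dest_a.append(parent_a[i])
        if i < len(parent_b):
            dest_b.append(parent_b[i])
    return child_1, child_2


def tournament_rank(population, fitness, n_keep, rng, n_opponents=5):
    """Rank sets by wins against random opponents (assumed scheme); keep the top n_keep."""
    scores = [fitness(ind) for ind in population]
    wins = []
    for score in scores:
        opponents = [rng.randrange(len(scores)) for _ in range(n_opponents)]
        wins.append(sum(1 for j in opponents if score >= scores[j]))
    order = sorted(range(len(population)), key=lambda i: wins[i], reverse=True)
    return [population[i] for i in order[:n_keep]]


def ga_cycle(population, fitness, rng):
    """One GA cycle: N sets -> N/2 matings -> 2N extended population -> N survivors."""
    n = len(population)
    mates = population[:]
    rng.shuffle(mates)
    offspring = []
    for parent_a, parent_b in zip(mates[0::2], mates[1::2]):
        offspring.extend(uniform_crossover(parent_a, parent_b, rng))
    return tournament_rank(population + offspring, fitness, n, rng)
```

Because the two parents may hold different numbers of detectors, the offspring can be larger or smaller than either parent, while the total number of detectors in the pair is preserved.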

To summarize, EMORPH consists of four distinct stages: transform generation (forming expression trees - GP phase),
detector  formation  (capping  expressions  -  EP  phase),  feature  selection  (forming  response  matrices  -  GA  phase),  and 
finally,  creating  a  nearest  neighbor  classifier  to  form  the  complete  pattern  recognition  system.  The  user  can  set 
parameters  to  control  the  duration  of  each  phase  as  well  as  control  parameters  to  influence  the  behavior  of  each  step 
of  the  process.  For  example,  the  user  can  increase  the  sensitivity  of  the  individual  detectors  by  increasing  the  number 
of  passes  through  the  EP  phase  relative  to  the  number  of  passes  through  the  GA  phase.  Alternatively,  the  user  may 
elect  to  spend  more  computational  resources  adjusting  the  average  complexity  of  the  detector  sets  by  increasing  the 
number  of  passes  through  the  GP  phase.  It  is  difficult  to  select  an  appropriate  mixture  of  passes  because  the 
evolutionary  learning  process  is  dynamic.  During  the  early  stages  of  evolution,  it  is  not  likely  that  the  complexity  and 
number  of  detectors  in  the  population  is  suitable  for  the  recognition  task.  If  the  user  arbitrarily  increases  the  number 
of  EP  passes,  the  probe  point  density  will  increase  to  compensate  for  the  lack  of  complexity  in  the  transforms  and 
GA Algorithm

Form N random sets of detectors by sampling the pool produced by the EP process
Evaluate each detector set by forming the response matrix and classifying the training set
for 1 to maxCycles do
    Generate a selection vector based on classification accuracy
    for 1 to maxMatings do
        Select pairs of detector sets
        Perform uniform crossover on detector sets to form two offspring
        Mutate offspring by adding/deleting detectors
        Evaluate offspring detector sets
    endfor
    Perform tournament selection to rank the full population
    Reduce population size to N based on ranking
endfor


Figure  12.  GA  Process  used  to  select  cooperative  sets  of  feature  detectors. 
limited number of detectors. This will produce customized solutions that tend to perform well on training sets and
poorly on test sets. If the number of GP cycles is too large, the transforms can become overly complex as they compensate for
the inadequate distribution of probe points within each cap. A good compromise would be to implement an adaptive
feedback mechanism to control the duration of each phase and adjust parameters relative to the contribution of each
phase throughout the evolutionary process.

EXPERIMENTAL DESIGN

To demonstrate how EMORPH generates pattern recognition systems, the results of a target recognition task in high
range resolution radar are presented. Specifically, the problem is to classify a set of airborne targets from their radar
cross sections. For this experiment, a sample of 60 radar signatures was extracted from a large database of signals.
Each radar signature is one view of a target at a specific azimuth and elevation. The selected data set contains six
targets at azimuth 25° and elevations that range from -20° to -11° in increments of 1°. Thus, there are ten
samples of each target in the data set. These were divided into a training set of 30 radar signatures that contains 5
samples of each target and a test set of 30 cross sections that also contains 5 samples of each target. The data were not
placed in the sets at random. The training set contains all targets with odd values of azimuth while the test set
contains the remaining signatures. This amounts to placing every other signature in the view volume (azimuths x
elevations) into one set and the remaining signatures into the other. Notice the signatures have been normalized into
•  Data set consists of 60 signals

•  Six targets at AZ=[-25,-16] at EL=-20

•  30 targets for training and 30 targets for testing

Figure 13. Training and Test Data Sets.
128 range bins with the maximum value (255) placed in bin 63. Looking down the column of data, it is easy to see
that there are characteristic features in each target that persist through a few degrees of change in elevation, but then
rapidly disappear. Also note the similarity in the signatures between targets, which makes the classification task quite
difficult.
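
The interleaved split can be illustrated with a short Python sketch. The tuple representation and the names `TARGETS` and `VIEW_ANGLES` are assumptions made for illustration; the sketch simply takes the ten integer view angles from -20° to -11° per target and sends odd values to the training set and even values to the test set.

```python
# Six targets, each viewed at ten integer angles from -20 to -11 degrees.
TARGETS = range(1, 7)
VIEW_ANGLES = range(-20, -10)

# One (target, angle) pair stands in for each radar signature.
signatures = [(target, angle) for target in TARGETS for angle in VIEW_ANGLES]

# Odd angles go to training, even angles to testing (every other view).
train_set = [s for s in signatures if s[1] % 2 != 0]
test_set = [s for s in signatures if s[1] % 2 == 0]
```

Each set then holds 30 signatures, five per target, and neighboring views always fall in different sets.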

An E-MORPH learning cycle consists of 35 GP sub-cycles, followed by 35 EP sub-cycles, followed by 150
GA sub-cycles. To begin the GP phase, a population of 100 transforms was generated at random. Each transform
was initialized with one to three operators. Each pass through the GP algorithm produced 100 offspring that were
evaluated using the local performance measure. Then tournament selection was used to reduce the population back to
100 transforms. The EP phase was then applied to form a pool of feature detectors. The initial caps were generated
with one or two randomly placed Gaussian points. The performance of each capped transform was evaluated by
computing the Fisher discriminant for the response vector produced by applying the detector to the training set. A
pass through the EP phase consisted of processing each member of the base population of 100 transforms. Each
member of the base population was used to produce 10 clones that were then mutated to produce an additional 10
recognition systems. This extended population of 20 caps was pruned back to 10 individuals using tournament
selection. The process was repeated five times and the best recognition system found competed to replace its
ancestor in the base population. The GA phase started with the base population of 10 recognition systems. Copies
of these base systems were mutated and recombined to create an extended population of 20 systems (10 parents + 10
offspring). The extended population was ranked using tournament selection and the top 10 systems were saved to
start the next GA sub-cycle.
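
As an illustration of how a single detector might be scored, the sketch below computes a Fisher-style criterion for a scalar response per training signature. The report does not spell out the exact form used, so the multi-class ratio of between-class to within-class scatter shown here, and the name `fisher_ratio`, are assumptions.

```python
from collections import defaultdict


def fisher_ratio(responses, labels):
    """Between-class scatter divided by within-class scatter of a scalar response."""
    groups = defaultdict(list)
    for response, label in zip(responses, labels):
        groups[label].append(response)

    grand_mean = sum(responses) / len(responses)
    between = 0.0
    within = 0.0
    for values in groups.values():
        class_mean = sum(values) / len(values)
        between += len(values) * (class_mean - grand_mean) ** 2
        within += sum((v - class_mean) ** 2 for v in values)

    # Perfectly tight classes give zero within-class scatter.
    return between / within if within > 0 else float("inf")
```

A detector whose responses cluster tightly within each target class and differ between classes receives a high score; a detector that responds similarly to every target scores near zero.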

For the EP phase, the probabilities of vibration, addition, and deletion are 0.6, 0.3, and 0.1, respectively. When a Gaussian
point is added to a cap, there is a 0.67 probability that the point is positive and a 0.33 probability that the point is
negative. The range of a Gaussian probe point is 4 to 12 range bins (i.e., pixels) and the maximum weight of a point is
limited to the range of 1 to 3. In the GP phase there is a 0.3 probability that a transform is mutated and a 0.7
probability  that  a  pair  of  transforms  is  recombined.  If  a  set  is  selected  to  undergo  mutation,  there  is  a  0.2  probability 
that  an  individual  transform  passes  to  the  offspring  unchanged;  a  0.7  probability  that  the  transform  is  extended  with  a 
randomly  selected  operator,  and  a  0.1  probability  that  a  random  tree  is  added  to  the  transform. 
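
The EP settings above can be read as a simple sampling procedure. The following Python sketch draws one cap mutation using those probabilities; the operator names, the dictionary layout, and the reading of "range" as a width in range bins and "weight" as a signed peak value are assumptions made for illustration.

```python
import random

EP_OPERATORS = ["vibrate", "add", "delete"]
EP_WEIGHTS = [0.6, 0.3, 0.1]  # probabilities of vibration, addition, deletion


def sample_ep_mutation(rng):
    """Draw one EP mutation; for an addition, also draw a new Gaussian probe point."""
    operator = rng.choices(EP_OPERATORS, weights=EP_WEIGHTS, k=1)[0]
    if operator != "add":
        return operator, None

    sign = 1.0 if rng.random() < 0.67 else -1.0  # positive with probability 0.67
    point = {
        "width": rng.randint(4, 12),              # extent in range bins
        "weight": sign * rng.uniform(1.0, 3.0),   # peak weight in [1, 3]
    }
    return operator, point
```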

The average recognition accuracy for the population produced during the evolutionary learning process is shown in
Figure 14. The performance is displayed at the end of each GA cycle. The average training set recognition accuracy
rises from a low of 38% in generation 0 to 69% in generation 150. Notice that the curve is not monotonically increasing.
This is because the uniform crossover used in the GA can extend and/or shrink detector sets, producing
disruptive effects. Also, tournament selection does not guarantee that the most accurate recognition systems will
survive. The overall trend shown in this graph suggests that the selection process is locating increasingly accurate
sets of features. Notice there is more variation in the test set performance than the training set performance. This is
[Figure 14 plot: Average Accuracy]

Figure 14. Average recognition accuracy for the training and test set

[Figure 15 plot: Maximum Accuracy; annotations: 90% or 27/30, 77% or 23/30]

Figure 15. Recognition accuracy of the best pattern recognition
system for the training and test data sets.
unfortunate, because it suggests that the features selected for the recognition systems are not generalizing. The
performance of the best recognition system in each generation is shown in Figure 15. The accuracy of the best
system rises from approximately 57% on the training set and 22% on the test set using 3 detectors to
a maximum of 100% on the training set and 90% on the independent test set. The best recognition system uses 9
detectors (Figure 16). The system appears to converge too rapidly, producing a best training set score of 100% in
generation 50. This sharply reduces the amount of exploration that can occur in future generations. This is partly due
to the small population size used in this experiment. It is interesting to note that several recognition systems appear
with test set accuracies approaching 100%, but these systems do not produce the top training scores. This suggests
that a two-level training set might improve performance. One set would be used to form the recognition system and a
second set would then be used to determine survival based on how well the system classified the secondary training
set.

DISCUSSION 

E-MORPH  successfully  generated  a  pattern  recognition  system  to  classify  high  range  resolution  radar  signatures. 
The  evolved  recognition  system  achieved  a  classification  accuracy  of  100%  when  applied  to  a  training  data  set 
consisting  of  30  radar  signatures  (5  samples  of  six  targets)  and  90%  accuracy  on  an  independent  set  of  30  additional 
signatures.  The  best  recognition  system  contained  nine  feature  detectors  composed  of  primitive  morphological  and 


[Figure 16 plot: Number of Feature Detectors]

Figure 16. Number of feature detectors used in the
average and most accurate recognition systems.
arithmetic operators capped by a special convolution template containing an evolved distribution of Gaussian-shaped
probe points. The responses of these detectors are processed by a simple nearest neighbor classifier that labels each
signature. The use of morphological operators in the construction of primitive feature detectors allows EMORPH to
evolve wavelet-like transformations that eliminate noise from the signatures and suppress information at various
spatial frequencies to facilitate the process of classifying targets.
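
The final classification step can be sketched as follows. This is a generic nearest neighbor rule under assumed conventions (Euclidean distance, one labeled feature vector per training signature, the hypothetical name `nearest_neighbor_label`), not code taken from EMORPH.

```python
import math


def nearest_neighbor_label(prototypes, feature_vector):
    """Return the label of the stored prototype closest to feature_vector.

    prototypes: list of (label, vector) pairs, one detector response per
    vector element.
    """
    best_label = None
    best_distance = math.inf
    for label, prototype in prototypes:
        distance = math.sqrt(
            sum((p - f) ** 2 for p, f in zip(prototype, feature_vector))
        )
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label
```

With nine detectors, each signature is reduced to a nine-element feature vector and labeled by the closest training vector.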


Although  EMORPH  achieves  excellent  recognition  results,  its  performance  can  be  improved.  Inspection  of  the 
evolved  feature  detectors  suggests  that  various  redundant  sub-expressions  within  the  detector  transformations  can  be 
eliminated  to  accelerate  the  evolutionary  search  process.  This  also  implies  that  adjusting  the  library  of  operators  and 
parameters  used  to  grow  feature  detectors  may  improve  both  accuracy  and  the  robustness  of  the  evolved  recognition 
systems.  In  addition,  EMORPH’s  control  parameters  were  not  carefully  tuned  for  this  specific  problem. 
Consequently,  even  better  performance  can  be  achieved  in  future  experiments  by  adjusting  the  library  of 
morphological  operators,  structuring  elements,  and  distribution  of  the  computation  resource  among  the  different 
phases  of  the  evolutionary  process. 

The techniques used in EMORPH are not tied to radar signal processing. The approach is generic and can readily
transition to many different problems in automatic target recognition. No single approach solves all problems in
automatic target recognition; EMORPH represents one viable alternative. The solutions generated using our
evolutionary learning algorithm are quite different from the solutions produced by human experts. This indicates that
human experts may not be using all of the available information to develop robust pattern recognition systems. In
future work, we hope to tune EMORPH, perform a more definitive set of experiments, and explore the possibility of
combining human expertise with the evolutionary search process to access alternative designs. This hybrid approach
to design may ultimately produce recognition systems with performance superior to any in use today.
AUTOMATED MODULAR FIXTURE PLANNING FOR VIRTUAL MATERIALS
PROCESSING: GEOMETRIC ANALYSIS


Yiming  (Kevin)  Rong 
Associate  Professor 
Manufacturing  Systems  Program 
Southern  Illinois  University  at  Carbondale 


Abstract 

Attendant processes such as fixture and die design are often a necessary but time-consuming
and expensive component of a production cycle. Coupling such attendant
processes to product design via feature-based CAD will lead to more responsive and
affordable product design and redesign. In the context of on-going research in automating
fixture configuration design, this report presents a fundamental study of automated fixture
planning with a focus on geometric analysis. The initial conditions for modular fixture
assembly are established together with needed relationships between fixture components
and the workpiece to be analysed. Of particular focus is the design of alternative locating
points and components, together with example 3-D fixture designs.


1.  Introduction 

Global competition has forced U.S. manufacturers to reduce production cycles and
increase product design agility. Generally, a manufacturing process is uncoupled and divided into:
product design, process design (selection, routing and tooling), and assembly. Obvious
and continual advances in computer-aided design (CAD), computer-aided process
planning (CAPP) and computer-aided manufacturing (CAM) are enabling more
multi-disciplinary design. However, computer-aided tooling (CAT), which is a critical part of
process design and a bridge between CAD and CAM together with CAPP, has been least
addressed and remains a missing link.

As a consequence of evolving CNC technology, specifically re-usable objects
called features coupling shape and process (milling, drilling, etc.) to generate machine
specific NC code, workpiece setup and associated fixturing has become the process
bottleneck. To address this bottleneck, research and development of flexible fixturing,
including modular fixturing technology, has received continued support. Modular fixture
components enable a large number of configurations to be derived, disassembled and
re-used. However, modular fixture design is a geometrically complex task and such
complexity impedes widespread application of modular fixtures. Development of an
automated modular fixture design system is needed to simplify process design of more
affordable products. This research focuses on a geometric analysis for automated modular
fixture planning which is inspired by several previous studies in this area, especially a
modular fixture synthesis algorithm [1] and an automated fixture configuration design
methodology [2].
Previous Research

Fixture design involves three steps: setup planning, fixture planning, and fixture
configuration design [2]. Setup planning research has been addressed in the context of
CAPP [3, 4, 5]. Seminal work in computer-aided fixture design (CAFD) focused on
fixture planning:

*  a method for automating fixture location and clamping [6];

*  an algorithm for selection of locating/clamping positions providing maximum
mechanical leverage [7];

*  kinematic analysis based fixture planning [8, 9]; and

*  rule-based systems to design modular fixtures for prismatic workpieces [10, 11].

But with respect to previous work on automating the configuration of workpiece
fixtures, i.e., automated fixture configuration design (AFCD), little can be found. Fixture
design depends upon critical locating and clamping points on workpiece surfaces, for
which fixture components can be selected to hold the workpiece based on CAD graphic
functions [12]. A 2-D modular fixture component placement algorithm has been
developed [13]. In addition, a method for automating design of the configuration of
T-slot based modular fixturing components has been developed [14]. A prototype AFCD
system has been developed, including three core modules: fixture unit generation and
selection module, fixture unit mount module, and interference checking module [2].
Assembly relationships between fixture components have also been defined and
automatically established [15].

Nearly all CAFD researchers admit that workpiece geometry is the pivotal
factor in a successful CAFD system. Since the geometry of workpieces may vary greatly,
many researchers in CAFD consider only regular workpieces, i.e., workpieces suitable for
the 3-2-1 locating method. There have been some attempts toward handling more
complicated workpiece geometries as in reference [16]. However, their results are only
applicable to some specific geometry, i.e., regular polygonal prisms.
Review of Brost-Goldberg Algorithm

Recently,  research  in  modular  assembly  based  on  geometric  access  and  assembly 
analysis  has  gained  considerable  attention.  Reference  [1]  presented  a  “complete” 
algorithm  for  synthesizing  modular  fixtures  for  polygonal  workpieces  and  reference  [17] 
explored  the  existence  of  modular  fixture  design  solutions  for  a  given  fixture  configuration 
model  and  a  workpiece.  Fixture  foolproofing  for  polygonal  workpieces  was  studied  [18], 
and  partially  employed  the  approach  in  reference  [1].  Reference  [19]  presented  a 
framework  on  automatic  design  of  3-D  fixtures  and  assembly  pallets,  but  no  detailed 
design  methodology,  procedure  and  results  were  provided. 

In  reference  [1],  an  algorithm  which  is  called  the  Brost-Goldberg  algorithm  was 
presented  for  synthesizing  planar  modular  fixtures  for  polygonal  workpieces.  The  basic 
assumptions  were  that  a  workpiece  can  be  represented  with  a  simple  polygon,  locators  can 
be  represented  as  circles  with  identical  radius  less  than  half  the  grid  spacing,  the  fixturing 
configuration  will  be  three  circular  locators  and  a  clamp,  the  base  plate  is  infinite,  and  all 
the  contacts  are  frictionless.  In  addition  to  polygonal  workpiece  boundaries  a  set  of 
geometric  access  constraints  are  provided  as  a  list  of  polygons  with  clamp  descriptions 
and  a  quality  metric.  The  output  of  the  algorithm  includes  the  coordinates  of  the  three 
locators,  the  clamp,  and  the  translation  and  rotation  of  the  workpiece  relative  to  the  base 
plate. The algorithm is implemented in the following steps:

1. The polygonal workpiece and geometric access constraints are transformed by
extending the workpiece by the radius of the locators, which are then treated as ideal points
(Figure 1) [1].

2. All candidate fixture designs are synthesized by enumerating the set of possible
locator setups. The possible clamp locations are also found for each locator setup; each
locator setup and clamp location specifies a unique fixture.

3. The set of candidate fixtures is then filtered to remove those that cause
problems such as collision. The survivors are then scored according to the quality metric.

In step 2, placements of three circular locators on the base plate are evaluated while
translating and rotating the workpiece relative to the base plate. An algorithm was also
presented to find all combinations of the three edges, where two of them may be identical,
on the polygon that satisfy the hole-alignment conditions with the base plate (Figure
2) [1]. For each set of locators and associated contact edges, consistent workpiece
configurations or workpiece positions are calculated. All the possible clamp positions are
then enumerated based upon the constraint analysis of the constructed force sphere.

The algorithm is called a “complete” algorithm for planar modular fixture design
because it guarantees finding all possible planar fixture designs for a specific polygonal
workpiece if they exist. However, the major limitations of the method are:

1. Only polygonal workpieces are considered, i.e., no curved surfaces are allowed
in the workpiece geometry. In reality, many fixture design cases include cylindrical
surfaces, or circular arcs in 2-D representations.

2. Only circular locating pins with uniform radius are considered in the algorithm.
In each modular fixture system, there are other types of locators available and
widely used in fixture designs.

3. The algorithm only considers 2-D workpieces. In practice, it can be applied
only to prismatic workpieces having small height, i.e., the 3-D fixture design problem is a
great challenge.

4. There are some criteria necessary for locating and clamping design in addition
to geometric considerations, including locating error, accuracy relationship analysis,
accessibility analysis, and other operational conditions.

5. Clamp location planning is weak without the consideration of friction forces,
and needs to be further improved.

In  this  report,  modifications  and  extensions  to  the  modular  fixture  synthesis 
algorithm  are  presented  with  regard  to  limitations  1, 2,  and  3  above.  Discussion  of 
limitations  4  and  5  will  be  presented  in  a  separate  report  [20]. 

2. Geometric Conditions

A prismatic workpiece is typically regarded as a 2-D workpiece including a set of edges
such as line segments and arcs, which are candidate locating edges. The locating and
clamping design problem becomes one of finding a group of three locating edge
combinations. For an explicit expression, let us define the set of expanded boundary edges
of the workpiece P as:

$$EBE(P) = \{ e_i \mid e_i \in \text{line segments};\ i \in NE \} \tag{1}$$

where NE is the number of candidate edges.

All combinations of three edges, two of which may be identical, on the polygon are
enumerated as:

$$Triplets(P) = \{ (e_i, e_j, e_k) \mid e_i, e_j, e_k \in EBE(P),\ \exists\, (a, b) \subset (i, j, k),\ a \ne b \} \tag{2}$$

Locator centers are designed to contact with edge combinations $ec = (e_i, e_j, e_k) \in Triplets(P)$.
Without loss of generality, it can be assumed that $e_i$ contacts a locator
$L_1$ at the origin of the base plate lattice, based on the assumption of an infinite base plate.
By translating and rotating $e_i$ about the origin, $e_j$ sweeps out an annulus centered on the
origin, with inner and outer radii equal to the minimum and maximum distance
between $e_i$ and $e_j$. The position set of the locator contacting $e_j$ should be within the swept
annulus as

$$P_2(e_i, e_j) = \{ p_2(x, y) \mid \text{min-dist}(e_i, e_j) < \text{dist}(p_2, p_1) < \text{max-dist}(e_i, e_j) \} \tag{3}$$

where $p_1$ is the origin of the base plate lattice.

Each $p_2$ is evaluated for selection as the second locator $L_2$ in contact with $e_j$. If $L_1$
contacts $e_i$ and $L_2$ contacts $e_j$, a third locator $L_3$ in contact with $e_k$ must be pairwise
consistent with both $e_i$ and $e_j$. The envelope containing the region swept by $e_k$ maintaining
contact with the first two locators can be easily determined by independently considering
each pair as

$$P_3(e_i, e_j, e_k) = \{ p_3(x, y) \mid p_3(x, y) \in P_2(e_k, e_i) \cap P_2(e_k, e_j) \}, \tag{4}$$

which is the same as presented in reference [1].
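
The enumeration in Eqs. 2 to 4 can be illustrated with a small Python sketch. It is not the implementation of reference [1]: the function names are hypothetical, lattice points are represented by integer grid indices, the minimum and maximum edge distances are taken as given inputs, and the second annulus is assumed to be centered on the already chosen second locator.

```python
import itertools
import math


def edge_triplets(edges):
    """Eq. 2: ordered triples of edges in which at least two edges differ."""
    return [
        triple
        for triple in itertools.product(edges, repeat=3)
        if len(set(triple)) > 1
    ]


def annulus_lattice_points(center, d_min, d_max, spacing=1.0):
    """Eq. 3: grid indices whose distance from center lies strictly in (d_min, d_max).

    center is a pair of integer grid indices; spacing is the hole pitch.
    """
    cu, cv = center
    reach = int(d_max // spacing)
    points = set()
    for du in range(-reach, reach + 1):
        for dv in range(-reach, reach + 1):
            distance = spacing * math.hypot(du, dv)
            if d_min < distance < d_max:
                points.add((cu + du, cv + dv))
    return points


def third_locator_candidates(p2, range_ki, range_kj, spacing=1.0):
    """Eq. 4: holes consistent with both the first locator (origin) and the second (p2)."""
    around_first = annulus_lattice_points((0, 0), range_ki[0], range_ki[1], spacing)
    around_second = annulus_lattice_points(p2, range_kj[0], range_kj[1], spacing)
    return around_first & around_second
```

For NE candidate edges the triple enumeration yields NE^3 - NE entries, so pruning with the annulus test before solving for workpiece poses keeps the search manageable.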

Assembly  Relationship  Analysis 

From  the  above  discussion,  it  has  been  shown  that  determining  the  positions  and 
orientations  of  modular  fixture  components  can  be  simpUfied  as  finding  geometric  entities 
such  as  line  segments  or  arcs  on  the  workpiece  passing  ideal  points  on  the  base  plate  after 
moving  (translating  and  rotating)  the  workpiece  relative  to  the  base  plate. 
As shown in Figure 3, the relative position between the workpiece and the base
plate can be represented by the relation of the workpiece and the base plate coordinate
systems, which are expressed as $X_w O_w Y_w$ and $X_b O_b Y_b$ respectively. Basically, there are three
locator-workpiece contact situations as shown in Figure 4:

*  Line segment contacts with a circular locator.

*  Arc contacts with a circular locator.

*  Arc contacts with a line.

When a locating edge $L_i$ on the workpiece is required to pass a point $P_i$ on the base
plate, i.e., the locator center needs to be aligned to a tapped (or pin) hole on the base
plate, $L_i$ can be expressed by

$$r_i x_w + s_i y_w + t_i = 0 \tag{5}$$

$P_i$ can be expressed as [2]

$$P_i \rightarrow (x_{bi}, y_{bi}) \tag{6}$$

where $x_{bi} = Tu$, $y_{bi} = Tv$; $u, v = 0, \pm 1, \pm 2, \ldots, \pm N$, and T is the
spacing increment between the tapped (or pin) holes on the base plate.

The workpiece is assumed to be translated by $(x, y)$ and rotated by $\theta$ relative to the
base plate. To simplify the calculation, an inverse transform is considered by holding the
workpiece fixed, and moving the base plate by $(-x, -y, -\theta)$. Then $P_i(x_{bi}, y_{bi})$ is
transformed to

$$\big( (x_{bi} - x)\cos\theta + (y_{bi} - y)\sin\theta,\ (y_{bi} - y)\cos\theta - (x_{bi} - x)\sin\theta \big). \tag{7}$$

Thus, the condition for the modular assembly can be described as:

$$r_i\big[(x_{bi} - x)\cos\theta + (y_{bi} - y)\sin\theta\big] + s_i\big[(y_{bi} - y)\cos\theta - (x_{bi} - x)\sin\theta\big] + t_i = 0. \tag{8}$$

For a specific workpiece, its geometric shape is fixed, which means the equation of
the line is fixed, i.e., $r_i$, $s_i$ and $t_i$ are constant. The assembly points are given, which means
$x_{bi}$ and $y_{bi}$ are constant. There will be three equations to solve for the three unknowns $x$, $y$ and $\theta$.
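
To make the three-equation system concrete, the following Python sketch solves Eq. 8 for three line contacts with Newton's method. The choice of Newton iteration, the function names, and the starting guess are assumptions for illustration; the report does not state which numerical method is used.

```python
import math


def residual(line, hole, pose):
    """Left-hand side of Eq. 8 for one line (r, s, t) and one hole (xb, yb)."""
    r, s, t = line
    xb, yb = hole
    x, y, theta = pose
    xw = (xb - x) * math.cos(theta) + (yb - y) * math.sin(theta)
    yw = (yb - y) * math.cos(theta) - (xb - x) * math.sin(theta)
    return r * xw + s * yw + t


def jacobian_row(line, hole, pose):
    """Partial derivatives of Eq. 8 with respect to x, y and theta."""
    r, s, _ = line
    xb, yb = hole
    x, y, theta = pose
    c, sn = math.cos(theta), math.sin(theta)
    xw = (xb - x) * c + (yb - y) * sn
    yw = (yb - y) * c - (xb - x) * sn
    return [-r * c + s * sn, -r * sn - s * c, r * yw - s * xw]


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, b):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("singular Jacobian: pose is not uniquely constrained")
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]
        for i in range(3):
            mc[i][col] = b[i]
        solution.append(det3(mc) / d)
    return solution


def solve_pose(lines, holes, guess=(0.0, 0.0, 0.0), tol=1e-10, max_iter=50):
    """Newton iteration on the three contact equations; returns (x, y, theta)."""
    pose = list(guess)
    for _ in range(max_iter):
        g = [residual(l, h, pose) for l, h in zip(lines, holes)]
        if max(abs(v) for v in g) < tol:
            return tuple(pose)
        jac = [jacobian_row(l, h, pose) for l, h in zip(lines, holes)]
        step = solve3(jac, [-v for v in g])
        pose = [p + dp for p, dp in zip(pose, step)]
    raise RuntimeError("Newton iteration did not converge")
```

For example, with the workpiece edges $y_w = 0$ (contacted twice) and $x_w = 0$, and holes at (3, 1), (6, 1) and (2, 3), the pose $(x, y, \theta) = (2, 1, 0)$ satisfies all three equations. Newton's method only finds a solution near its starting guess, so in practice each candidate pose would need a suitable initial estimate.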

If a circular locator contacts with an arc centered at $O_i(u_0, v_0)$ with radius R, the arc
can be represented as:

$$|P_i O_i| = R, \quad \text{or} \quad (x_w - u_0)^2 + (y_w - v_0)^2 = R^2 \tag{9}$$

The contact equation will be:

$$\big(u_0 - (x_{bi} - x)\cos\theta - (y_{bi} - y)\sin\theta\big)^2 + \big(v_0 - (y_{bi} - y)\cos\theta + (x_{bi} - x)\sin\theta\big)^2 = R^2 \tag{10}$$

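When an arc centered at $O_i(u_0, v_0)$ with radius R contacts a line-contact locator
such as a V-pad or half-Vee which has an edge AB, the third situation applies.
Assume $\beta_{min}$, $\beta_{max}$ are the extreme directional angles of $P_i O_i$ that make the arc still
maintain contact with AB (Figure 4.c). Therefore,

$$\beta'_{min} = \beta_{min} + \theta, \quad \beta'_{max} = \beta_{max} + \theta$$

$$O_i(x_i, y_i) = (u_0\cos\theta - v_0\sin\theta + x,\ v_0\cos\theta + u_0\sin\theta + y)$$

Since $\text{distance}(O_i, AB) = R$, the fixturing condition becomes:

$$\big[r_i(u_0\cos\theta - v_0\sin\theta + x) + s_i(v_0\cos\theta + u_0\sin\theta + y) + t_i\big]^2 = R^2 (r_i^2 + s_i^2) \tag{11}$$

where line AB in the base plate coordinate system is represented as:

$$r_i x_b + s_i y_b + t_i = 0 \tag{12}$$

and the direction of $P_i O_i$ should be within $\beta'_{min}$ to $\beta'_{max}$.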

The position of a planar workpiece requires three parameters: the x and y
coordinates as well as the rotational angle $\theta$ of the workpiece coordinate system. When
the workpiece is placed into the fixture, it should contact with the three locators with
edges numbered j, k and l. Each contact will provide an equation concerning the
workpiece location x, y and $\theta$. Eqs. 8 and 10 can be generally presented as:

$$G_i(x, y, \theta) = 0, \quad i = j, k \text{ and } l. \tag{13}$$

The workpiece can be represented by a group of differentiable
functions in terms of workpiece coordinates $(x, y, \theta)$ relative to the base plate because of
the translation and rotation of the workpiece:

$$G_i(x, y, \theta) = 0, \quad i = 1, 2, \ldots, n \tag{14}$$

where n represents the number of candidate locating points.

Vagueness of 2-D Solutions

Once the workpiece is positioned, the orientation should be unique. Solving the
three-equation set (Eq. 13) may provide a solution; however, it is also possible that there is no
solution or that there are infinitely many solutions. The no-solution situation means the third locator has
been chosen from an image-locus, i.e., all possible solutions exist in the
pseudo-locus, but the pseudo-locus will also provide positions of the third locator
which are not possible. In other situations, there are an infinite number of solutions, e.g.,
when all three edges are parallel or when the three contact normals meet at a single point
(Figure 5). These cases should be discarded since they do not constrain the workpiece to
a unique location.

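In order to obtain the necessary conditions for a unique solution, assume that the
workpiece can be positioned while contacting all three locators in a position (x_0, y_0, θ_0)
with a disturbance:

$G_i(x + \Delta x,\; y + \Delta y,\; \theta + \Delta\theta) = 0, \quad i = j, k, l$  (15)

Expanding to first order,

$G_i(x, y, \theta) + \frac{\partial G_i}{\partial x}\Delta x + \frac{\partial G_i}{\partial y}\Delta y + \frac{\partial G_i}{\partial \theta}\Delta\theta = 0$

For a stationary locating,

$\frac{\partial G_i}{\partial x}\Delta x + \frac{\partial G_i}{\partial y}\Delta y + \frac{\partial G_i}{\partial \theta}\Delta\theta = 0, \quad i = j, k, l$  (16)

Therefore, the condition for the equation set to have a single solution is:

$\det \begin{bmatrix} \frac{\partial G_j}{\partial x} & \frac{\partial G_j}{\partial y} & \frac{\partial G_j}{\partial \theta} \\ \frac{\partial G_k}{\partial x} & \frac{\partial G_k}{\partial y} & \frac{\partial G_k}{\partial \theta} \\ \frac{\partial G_l}{\partial x} & \frac{\partial G_l}{\partial y} & \frac{\partial G_l}{\partial \theta} \end{bmatrix} \neq 0$  (17)

For a valid solution, it is also important to consider the workpiece tolerances.
When the geometric dimensions of the workpiece vary in a certain range, the locating
contacts should be maintained. Similar analysis can be conducted, but the specifics will be
presented in a forthcoming report [20]. Some other valuable discussion on similar
problems can be found in reference [21].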
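The single-solution condition can be illustrated with a minimal sketch. It assumes (this is our simplification, not the report's formulation) that every locator is an ideal point contact on a straight edge, described by a contact point (px, py) and a contact normal (nx, ny) in the base-plate frame. Under that assumption each row of the matrix of partial derivatives is, up to sign, (nx, ny, px·ny − py·nx), and the determinant vanishes in the degenerate cases of parallel edges or concurrent contact normals. The function names and the tolerance are hypothetical.

```python
def contact_determinant(contacts):
    """Determinant of the 3x3 matrix built from three point contacts.

    Each contact is (px, py, nx, ny): contact point and contact normal.
    Row i is (nx, ny, px*ny - py*nx), i.e. the normal plus its moment
    about the origin (an assumed point-on-line contact model).
    """
    rows = [(nx, ny, px * ny - py * nx) for (px, py, nx, ny) in contacts]
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def is_unique_location(contacts, tol=1e-9):
    """True when the three contacts constrain x, y and theta uniquely."""
    return abs(contact_determinant(contacts)) > tol


# Two locators on the bottom edge and one on the left edge.
valid = [(1.0, 0.0, 0.0, 1.0), (3.0, 0.0, 0.0, 1.0), (0.0, 1.0, 1.0, 0.0)]
# All three contact normals pass through the origin (concurrent normals).
concurrent = [(1.0, 0.0, 1.0, 0.0), (0.0, 1.0, 0.0, 1.0), (-1.0, 0.0, -1.0, 0.0)]
# All three edges parallel: every normal points along +y.
parallel = [(0.0, 0.0, 0.0, 1.0), (1.0, 0.0, 0.0, 1.0), (2.0, 0.0, 0.0, 1.0)]

print(is_unique_location(valid), is_unique_location(concurrent), is_unique_location(parallel))
```

Locator triplets for which `is_unique_location` returns False would be discarded before any further assembly analysis.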


3.  Assembly  Analysis 

In  this  section,  various  locators  and  clamps  are  considered  in  fixture  planning.  In  order  to 
apply  the  fixture  planning  algorithm  discussed  in  the  previous  section,  the  geometric 
analysis  for  workpiece  boundary  expansion  should  be  performed  for  actual  locators  and 
clamps.  Generally,  there  are  two  types  of  locating  edges  for  2-D  workpiece 
geometry,  line  segments  and  arcs  which  may  lie  in  either  internal  or  external  contours. 
Several  locator  types  are  used  for  side  locating,  including  round  locating  pins  (Figure  6.a), 
locating  towers  (Figure  6.b),  adjustable  stops  (Figure  6.c),  half-Vees  (Figure  6.d),  V-pads 
(Figure  6.e),  round  hole  pins  (Figure  6.f)  and  diamond  hole  pins  (Figure  6.g). 

If the locating edge is a line segment, a round locating pin, locating tower or
adjustable stop may be used. For an arc segment, half-Vee and V-pad are considered first.
However, a round locating pin, locating tower, or adjustable stop may also be used for arc
edge contacts. Generally, locating a 2-D workpiece requires limiting three degrees of
freedom (DOF): two translational and one rotational. Three line or arc edges, two of which
may coincide, should be selected for locating purposes. Thus, a locator configuration
should be considered for each of the various edge combinations. Table 1 shows the possible
locator configurations with assigned preference and provides criteria for preliminary
selections of locators and clamps.

It was shown in reference [17] that the three circular locating pin configuration
is not universal for arbitrary 2-D workpieces. Indeed, there exist some workpieces
which cannot be fixtured using this configuration, in which case the type of locator may be
changed. An alternative may involve the use of adjustable stops with adjustable contacting
lengths. The distance from the contact point to the locator center may be larger than half
of the base plate grid distance, which may greatly improve the locating capability.

Locating  geometric  analysis  is  based  on  the  geometric  constraints  imposed  on  the 
workpiece  and  locator  position.  Here  locators  must  maintain  contact  with  specific 
locating  edges  on  the  workpiece.  The  modular  fixture  assembly  requires  the  locators  to  be 
assembled  through  holes  in  the  base  plate.  For  the  2-D  situation,  the  assembly  process  is 
to  find  the  suitable  assembly  holes  in  the  base  plate  which  can  locate  the  workpiece. 
The following cases show how to find possible locator positions.

Locators used for line segments are discussed first, such as the locating tower and
adjustable stop (locators b and c in Figure 6), as shown in Figure 7. Locating towers can
be treated as smaller circular locators whose radius r is

r = distance(locator center, locating edge)  (18)

However,  it  should  be  noted  that  for  locating  towers,  the  possible  contact  region 
between  the  locating  tower  and  locating  edge  should  be  reviewed  to  ensure  the  functional 
stability  of  the  locating  tower  (Figure  8): 


where U is the effective locating edge length, L is the original length of the locating
edge, and d is the length of the locating surface.


The adjustable stop can be treated as a circular locator with a variable radius r:

min-acting-distance < r < max-acting-distance  (20)

Using such a geometric representation, input geometry transformation may be used
to perform geometric analysis by expanding the corresponding locating edges by the
equivalent radius.
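The edge expansion can be sketched for the simplest case, a straight locating edge. The sketch below assumes the expanded edge is the original segment shifted by the equivalent radius along its outward normal, so that a circular locator touching the edge has its center on the shifted segment. The function name and the `outward` argument are hypothetical, and end effects at the segment tips are ignored.

```python
import math


def expand_segment(p, q, radius, outward=(0.0, 1.0)):
    """Offset segment p-q by `radius` along its normal on the outward side.

    `outward` is any vector pointing away from the workpiece material; it
    only selects which of the two segment normals is used.
    """
    dx, dy = q[0] - p[0], q[1] - p[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        raise ValueError("degenerate segment")
    nx, ny = -dy / length, dx / length           # one of the two unit normals
    if nx * outward[0] + ny * outward[1] < 0.0:  # flip to the outward side
        nx, ny = -nx, -ny
    shift = (radius * nx, radius * ny)
    return ((p[0] + shift[0], p[1] + shift[1]),
            (q[0] + shift[0], q[1] + shift[1]))


# Bottom edge of a workpiece lying above y = 0; the locator sits below it.
expanded = expand_segment((0.0, 0.0), (4.0, 0.0), radius=0.5, outward=(0.0, -1.0))
print(expanded)
```

For an adjustable stop, the same call could be repeated for several radii between the minimum and maximum acting distances.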


If the locator configuration employs a circular round-pin or diamond-pin to locate with
small internal holes, the assembly analysis is easy since the centers of the hole and the
hole pin should be aligned. As shown in Figure 9.a, the first step of assembly is to align
the diamond-pin with a hole on the base plate. Then the workpiece has only one rotational
DOF to find the suitable assembly holes for other locating edges. Generally, adjustable
locators may be used to ensure the availability of assembly holes for the other two locating
edges. The round-pin application is shown in Figure 9.b.


If two small holes are employed, either the distance between the two holes has to
be standard, i.e.

$\overline{O_1 O_2} = kT$  (21)

where O_1, O_2 are the centers of the two holes and T is the base plate grid distance, or,
if this condition is not valid, an adjustable-bar support should be used near the bottom of
the workpiece for one hole locator to ensure the assembly of the hole locator, which then
becomes a 3-D locating problem (Figure 10).
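A small check of the standard-spacing condition of Eq. (21) might look as follows. The report does not define k; the sketch assumes it is a positive integer, and the function name and tolerance are hypothetical.

```python
import math


def grid_multiple(o1, o2, grid, tol=1e-6):
    """Return k if |O1 O2| equals k * grid within `tol`, otherwise None.

    Assumption: k is a positive integer multiple of the grid distance T.
    """
    dist = math.hypot(o2[0] - o1[0], o2[1] - o1[1])
    k = round(dist / grid)
    if k >= 1 and abs(dist - k * grid) <= tol:
        return k
    return None


print(grid_multiple((0.0, 0.0), (60.0, 0.0), 30.0))  # spacing is standard
print(grid_multiple((0.0, 0.0), (50.0, 0.0), 30.0))  # needs the adjustable-bar option
```

A result of None corresponds to the second branch of the text, where an adjustable-bar support is used for one of the hole locators.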

Case  3:  Arc  segment  locating  using  circular  locators 

When the locating edge is an arc, input geometry transformation can also be used by
expanding the arc by the equivalent radius of the locator in the direction of the external
normal. This is applicable to both external and internal arcs. The locus analysis is almost
the same as that presented for the line segment situation. The major difference lies in
calculating the workpiece location and orientation.

As described in section 2, when the first circular locator is placed in the base plate
origin, by translating and rotating e_i about the origin, e_j sweeps out an annulus centered
on the origin, with inner and outer radii equal to the minimum and maximum distances
between e_i and e_j. The position set of the locator contacting e_j should be within the
swept annulus:

$P_2(e_i, e_j) = \{\, p_2(x, y) \mid \text{min-dist}(e_i, e_j) < \text{dist}(p_2, p_1) < \text{max-dist}(e_i, e_j) \,\}$  (22)

where p_1 = origin of base plate lattice.

It should be noted that e_i and e_j could be either line or arc segments when using
circular locators. However, in the case of applying other types of locators with arc edges,
such as V-pad and half-Vee, the way to find locator positions with hole alignment
relationships needs to be further studied.
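The position set of Eq. (22) can be enumerated directly once the minimum and maximum edge distances are known. The sketch below assumes a square base-plate lattice and treats the two bounds as inclusive; the function name, the `extent` parameter and the numeric values are hypothetical.

```python
import math


def holes_in_annulus(center, r_min, r_max, grid, extent):
    """List base-plate holes whose distance to `center` lies in [r_min, r_max].

    Holes are assumed to sit on a square lattice with spacing `grid`, at
    indices -extent..extent in both directions around the plate origin.
    """
    found = []
    for i in range(-extent, extent + 1):
        for j in range(-extent, extent + 1):
            x, y = i * grid, j * grid
            d = math.hypot(x - center[0], y - center[1])
            if r_min <= d <= r_max:
                found.append((x, y))
    return found


# First locator at the lattice origin; distances are in grid units here.
candidates = holes_in_annulus((0.0, 0.0), r_min=1.5, r_max=2.5, grid=1.0, extent=3)
print(len(candidates))
```

Each returned hole is only a candidate for the second locator; the uniqueness and contact conditions still have to be checked for the full triplet.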

Case  4:  Locating  with  V-pad 

In Table 1, when the locating triplet is composed of one line segment e_2 and one external
arc e_1, the recommended locator configuration is one V-pad and one circular locating pin.
As distinguished from circular locating pins, assembling a V-pad requires two locating
holes in the base plate instead of one, and the orientation of the V-pad cannot be
arbitrary: it must be one of four perpendicular orientations.

As shown in Figure 11, a V-pad is placed around the origin of the base plate and
oriented in one of the four possible perpendicular orientations. The center of the locating
arc O_1, as well as the contacting points between the V-pad and the workpiece, are then
determined. The position of the circular locating pin may be found through rotating
the workpiece while maintaining a 2-point contact between the external arc e_1 and the
V-pad. The locus of the round locating pin is a part of the annulus centered on the fixed
locating arc center, whose inner and outer radii are the minimum and maximum
distance between the arc center and the line segment e_2:

$P_2(O_1, e_1, e_2) = \{\, p_2(x, y) \mid \text{min-dist}(O_1, e_2) < |O_1 p_2| < \text{max-dist}(O_1, e_2) \,\}$  (23)

The angle scope of the partial annulus is determined by the possible rotation angle of the
locating arc about the V-pad without loss of contact:

$\alpha_{min} - 90° + \beta < \text{angle} < \alpha_{max} + 90° - \beta$  (24)

where α_min is the minimum angle between e_1 and e_2 with reference O_1; α_max is the
maximum angle between e_1 and e_2 with reference O_1; and β = 45° for a 90° V-pad or 30°
for a 120° V-pad.
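The angle scope of Eq. (24) is simple to evaluate. The sketch below assumes the upper bound reads α_max + 90° − β (the sign is hard to make out in the source) and uses β = 90° − (V-pad angle)/2, which reproduces the two quoted values of 45° and 30° but is otherwise our extrapolation. The function name is hypothetical.

```python
def vpad_angle_range(alpha_min, alpha_max, vpad_angle):
    """Return (low, high) in degrees for the partial annulus of Eq. (24).

    beta = 90 - vpad_angle / 2 reproduces the two values quoted in the text
    (45 for a 90-degree V-pad, 30 for a 120-degree V-pad); using it for other
    V-pad angles is an extrapolation.
    """
    beta = 90.0 - vpad_angle / 2.0
    return alpha_min - 90.0 + beta, alpha_max + 90.0 - beta


print(vpad_angle_range(30.0, 60.0, 90.0))
print(vpad_angle_range(30.0, 60.0, 120.0))
```

Candidate pin positions from the annulus of Eq. (23) would then be kept only if their polar angle about O_1 falls inside the returned range.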


A locating configuration may require using one half-Vee (or other line-contact locators)
for an arc segment. The assembly character of half-Vee locators is more complicated than
that of a V-pad. The shape of a half-Vee is shown in Figure 12.a. There are three locating
holes in one half-Vee. When assembling, two holes in the half-Vee need to be accurately
aligned with two locating holes in the base plate. There are only four possible directions
for the half-Vee when assembled to the base plate. In this report, the two locating holes
VLH_1 and VLH_2 with equal distance to the oblique edge are analyzed. Other half-Vee
shapes should be addressed via the same method.

First, a half-Vee is placed in a specific position on the base plate by aligning VLH_1
and VLH_2 with two locating holes BLH_1 and BLH_2 centered at H_1 and H_2 in the base
plate. When the given arc (e_1) centered at O_1 maintains contact with the half-Vee, it can
be transformed by growing the given arc by r (Figure 12.a). The contact situation can be
thought of as an arc rolling over a line segment.

The second locator position relative to the second locating edge (e_2) can be found
using geometric locus analysis. When e_1 maintains a one-point contact with a half-Vee,
the workpiece can translate and rotate, and e_2 sweeps out a region which may be confined
to a geometry centered at the connection line of the two locating holes in the half-Vee. The
geometry is derived from the swept partial annulus when the arc, contacting different
positions on the half-Vee, simply rotates without slip. The locus may be further refined by
considering angle limitations of arc rotation (Figure 12.b). Generally, the geometry swept
by e_2 can be defined as a ribbon satisfying the following conditions:

1) The locus geometry is relative to a reference line segment H_1H_2 where:

distance(O_1, H_1H_2) = R*  (25)

where R* is the expanded radius of e_1, measured in the outer normal direction of H_1H_2.

2) The two limit line segments are determined by offsetting H_1H_2 through

dist1 = maximum-distance-refer-to-H_1H_2 (O_1, e_2)
dist2 = minimum-distance-refer-to-H_1H_2 (O_1, e_2)  (26)

which means the distance in the direction perpendicular to H_1H_2 (Figure 12.c).

If the second locator is designed to be a circular locating pin, the position of the
locator may be chosen among the generated locus. If the second locator is designed to be
another half-Vee, the position of the second half-Vee may be found through a similar
assembly analysis. Noting that the second half-Vee can be transformed to an ideal line
segment, positioning the second half-Vee requires finding the position of that line
segment. The line covering the line segment should first be determined, and then the
relative line segment may be determined such that the line segment contacts the
workpiece. Any line intersecting the generated locus may be a candidate. For the third
locator placement, i.e., locating edge e_3, the position can be found by considering the
intersection of the loci of two pairs: e_1 with e_3, and e_2 with e_3. The intersection can
cover the geometry swept while e_1 and e_2 maintain contact with the workpiece.
Figure 12.d shows the swept region of possible positions for the third locator, which needs
to be further constrained by the feasible rotation angles of the workpiece relative to the
half-Vees when they are in contact.


4. 3-D Fixture Configurations

2-D fixture planning as discussed above is limited to prismatic workpieces where the
height of the workpiece is relatively small. The vast majority of workpieces are
three-dimensional, so it is desirable to extend the above 2-D strategies. Such a fixture
configuration design system has been developed in which, once fixturing points are
specified, fixturing units can be automatically generated [2]. In this section, a 3-D
automated modular fixture planning procedure is presented, followed by 3-D assembly
analysis.

3-D Automated Modular Fixture Planning Procedure

Prior to fixture planning, the orientation of the workpiece relative to the base plate
as well as the machining surfaces in each setup must be determined in setup planning.
Although the workpiece geometry could be very complex, only four kinds of surfaces need
be considered for locating purposes: planes parallel to the base plate (surface type A),
planes perpendicular to the base plate (surface type B), cylindrical surfaces with an axis
parallel to the base plate (surface type C), and cylindrical surfaces with an axis
perpendicular to the base plate (surface type D). A 3-D automated modular fixture
planner is outlined in Figure 13.

A. Determination of candidate locating surface set

The first step in using a 3-D automated modular fixture planning procedure is to find all
candidate locating surfaces based on the above 'four-kinds-of-surfaces' assumption. The
candidate locating surfaces can be obtained by retrieving the CAD model of the
workpiece. The candidate locating surface set can be further refined if we assume that
locating can be divided into two types: horizontal and vertical locating. Surfaces of type
B and type D can be used for horizontal locating. Surfaces of type A and type C can be
used for vertical locating. For vertical locating, those planes whose external normal is
opposite to the base plate are discarded.
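The surface typing and the horizontal/vertical split described above can be sketched as a small classifier. The sketch assumes the base plate lies in the x-y plane with +z pointing away from it, that each surface is given by a unit direction (outward normal for a plane, axis for a cylinder), and that "external normal opposite to the base plate" means a normal pointing away from the plate. All names are hypothetical.

```python
def classify_surface(kind, direction, tol=1e-6):
    """Return surface type 'A', 'B', 'C', 'D', or None for other surfaces.

    kind      -- 'plane' or 'cylinder'
    direction -- unit outward normal (plane) or unit axis (cylinder), with
                 the base plate assumed to lie in the x-y plane (normal = z).
    """
    z = abs(direction[2])
    parallel_to_z = abs(z - 1.0) <= tol
    perpendicular_to_z = z <= tol
    if kind == "plane":
        if parallel_to_z:
            return "A"   # plane parallel to the base plate
        if perpendicular_to_z:
            return "B"   # plane perpendicular to the base plate
    elif kind == "cylinder":
        if perpendicular_to_z:
            return "C"   # axis parallel to the base plate
        if parallel_to_z:
            return "D"   # axis perpendicular to the base plate
    return None


def locating_role(kind, direction):
    """Map a surface to 'horizontal', 'vertical', or None (not a candidate)."""
    surface_type = classify_surface(kind, direction)
    if surface_type in ("B", "D"):
        return "horizontal"
    if surface_type == "C":
        return "vertical"
    if surface_type == "A":
        # Assumption: a plane whose outward normal points away from the
        # base plate (+z) cannot rest on a vertical locator and is discarded.
        return "vertical" if direction[2] < 0 else None
    return None


print(locating_role("plane", (0.0, 0.0, -1.0)), locating_role("cylinder", (0.0, 0.0, 1.0)))
```

In practice the directions would come from the CAD model of the workpiece after the setup orientation has been fixed.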
B. Locating surface group selection

The next step is to select horizontal locating surfaces and vertical locating surfaces from
the candidate locating surface set. Generally, three surfaces for each locating purpose
should be selected as a group. The three vertical locating surfaces could be reduced to a
single surface. The three horizontal locating surfaces could be reduced to two surfaces
with one surface being chosen twice. The locating surface groups are selected by
considering accuracy relationships, geometric accessibility, and operational conditions.
A priority index may be generated for each locating surface group so that the surface
group with the highest priority will be processed first. If this strategy later fails to
provide a reasonable fixture plan, the surface group with the next highest priority index
is chosen until one reasonable fixture plan is generated.
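The priority-driven retry described above amounts to a simple loop. The sketch below assumes the priority index is a number where larger means more preferred, and that the downstream planning steps are wrapped in a callable that returns None on failure; both the callable and the toy data are hypothetical.

```python
def plan_with_priorities(surface_groups, try_plan):
    """Try surface groups from highest to lowest priority index.

    surface_groups -- iterable of (priority_index, group) pairs
    try_plan       -- callable returning a fixture plan, or None on failure
    Returns (group, plan) for the first success, or None if every group fails.
    """
    ordered = sorted(surface_groups, key=lambda item: item[0], reverse=True)
    for _priority, group in ordered:
        plan = try_plan(group)
        if plan is not None:
            return group, plan
    return None


# Toy planner (hypothetical): only groups containing surface "S3" succeed.
def toy_planner(group):
    return {"locators_on": group} if "S3" in group else None


groups = [(0.4, ("S1", "S3", "S5")), (0.9, ("S1", "S2", "S4")), (0.7, ("S2", "S3", "S4"))]
result = plan_with_priorities(groups, toy_planner)
print(result)
```

Here the highest-priority group fails, so the group with the next highest index is used, mirroring the fallback rule in the text.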

C. Horizontal locating

The third step involves horizontal locating. Horizontal locating surface groups have been
chosen in the second step. Considering each side as a locating surface, one locating unit
(which usually consists of one locator and several supporting components) is constructed
by using the automated fixture configuration design functions [2]. When the heights of the
locating points are approximately determined, e.g., the half-height position of the side
locating surfaces, the locating units for each side locating surface are generated with
assembly relationships between fixture components of the units. The assembly analysis is
then performed to place these locating units on the base plates.

Generally, the position of a 3-D workpiece is determined by six parameters: three
translation parameters (x, y and z) and three rotational parameters (α, β, γ) about the x,
y, and z axes. Since the workpiece should maintain its relative orientation to the base
plate, the rotational parameters about the x and y axes, α and β, are fixed. After all the
side locating units are placed, three position parameters (x, y, and γ) will be determined.
The parameter z will be determined by the clearance requirement between the workpiece
and the base plate.

D. Vertical locating

In vertical locating, the locators are first chosen by considering the types and surface
finish of the vertical locating surfaces. Similarly, the vertical locating units are generated
by applying the automated fixture configuration design functions.

E.  Clamping  design 

In clamping design, the number and type of clamps employed should first be decided based
on workpiece stability analysis and operational rules. All possible clamping faces are then
collected into a set. A combination of several candidate clamping surfaces is then
selected. Assembly analysis is performed to place the clamps on the base plate given the
assembly character of the clamps. Detailed analysis and discussion of clamp planning can
be found in reference [20].

It should be noted that automated modular fixture planning for 3-D workpieces is
very complicated. This design methodology only provides a framework for fundamental
analyses of 3-D automated modular fixture planning.

3-D  Modular  Assembly  Analysis 

Modular assembly analysis is the focus of this report, wherein the modular assembly
analysis for 2-D situations is expanded to 3-D. In 3-D situations, locating units instead of
locators are the major concern when conducting the assembly analysis. Figure 14 shows a
sketch of locating units. A locating unit typically consists of a locator on the top and
several supporting components. Below, only horizontal locating units are discussed since
the assembly of vertical locating units is relatively easy. The side locating units are divided
into two categories based on the characters of their locators: direction-fixed and
direction-variable.

When a workpiece maintains contact with an edge bar, the contact direction is
fixed. If the locator is a round locating pin, locating tower or adjustable stop, the contact
direction of the locator can change arbitrarily to suit the locating surfaces on the
workpiece. Placing the direction-fixed locating units will pose additional constraints on
the direction of the side locating surfaces. In other words, two direction-fixed locating
units may conflict if their locating directions are not compatible. However, using a
direction-fixed locating unit will also simplify the assembly process because of the
assembly constraints. Direction-variable locating units are often more flexible.
Direction-variable locating units will be discussed below.

Given a locating unit, the bottom component is connected with the base plate.
Generally, the bottom component may use two locating holes to accurately determine the
position and orientation of the bottom component. If two locating holes are needed, the
placement of the locating unit can have only four directions parallel to the base plate
symmetrical axes. The other important component in the unit is the locator which
contacts the workpiece. When the locating unit is generated, all the components in
the locating unit are determined and their relative positions are also determined [2].
Therefore, the relative position of the locator to the bottom component can be derived,
which is very important to assembly analysis.

In 3-D situations it is assumed that there are three generated side locating units,
SLU1, SLU2, and SLU3, which are designed to contact the three side locating
surfaces S1, S2 and S3. First, the 3-D workpiece is projected onto the base plate and
becomes a 2-D geometry. Since S1, S2 and S3 are planes or cylindrical surfaces
perpendicular to the base plate, three segments of lines or arcs are obtained with respect
to the three side locating surfaces. They are then expanded by the radius of each locator
respectively to get three segments of lines or arcs (s1, s2 and s3), and the locators can be
reduced to ideal points (Figure 15). SLU1 is placed around the origin of the base plate,
which also positions locator 1. Thus, s1 should maintain contact with locator 1, while s1
can rotate and slip. s2 sweeps out an annulus centered at locator 1, just like the 2-D
situation (Figure 16). The position of SLU2 can be determined by transforming all the
possible placement origins of bottom components by the x, y offsets of the locator, which
may have four directions. All possible transformed placement origins falling inside the
swept annulus will be suitable as candidate SLU2 locations. In the same way, SLU3 can
be positioned by considering s3 pairwise consistent with s1 and s2.

When all side locating units are placed, their positions are sent to another module
to calculate the x, y translation position and γ rotational position.
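The SLU2 placement step can be sketched as follows. The sketch assumes the annulus radii are already known, that the four allowed directions are quarter turns about the placement origin, and that the locator offset is given in the unit's own frame. Function and parameter names are hypothetical.

```python
import math


def rotate_quarter_turns(offset, turns):
    """Rotate a 2-D offset by `turns` * 90 degrees (exact, no trigonometry)."""
    x, y = offset
    for _ in range(turns % 4):
        x, y = -y, x
    return x, y


def candidate_unit_placements(holes, locator_offset, locator1, r_min, r_max):
    """Return (hole, turns) pairs whose locator point falls in the annulus.

    holes          -- placement origins available on the base plate
    locator_offset -- x, y offset of the locator from the unit's bottom
                      component origin, in the unit's own frame
    locator1       -- position of the already placed first locator
    """
    placements = []
    for hx, hy in holes:
        for turns in range(4):                      # four allowed directions
            ox, oy = rotate_quarter_turns(locator_offset, turns)
            d = math.hypot(hx + ox - locator1[0], hy + oy - locator1[1])
            if r_min <= d <= r_max:
                placements.append(((hx, hy), turns))
    return placements


found = candidate_unit_placements([(3, 0)], (0.5, 0.0), (0.0, 0.0), 3.4, 3.6)
print(found)
```

Only the unrotated placement puts the locator point inside the annulus in this toy case; the same routine could be reused for SLU3 with the two pairwise annuli.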
5. Examples and Summary

A  geometric  analysis  for  automated  fixture  planning  has  been  presented,  which  is  an 
expansion  of  previous  research  on  automated  fixture  configuration  design  and  2-D 
geometric  synthesis.  Cylindrical  surfaces,  different  types  of  locating  components,  and  3-D 
fixture  configurations  have  been  considered  in  the  analysis.  Figure  17  shows  two 
examples  of  fixture  designs  resulting  from  the  fixture  planning  and  fixture  configuration 
design.  A  comprehensive  automated  fixture  planning  and  configuration  design  system  is 
under  development  where  analyses  of  locating  accuracy,  geometric  accessibility,  clamp 
planning,  and  fixture  design  stability  are  included. 
Table 1. A partial list of possible locator configurations

| Locating edge combinations | Locator configuration #1 | Locator configuration #2 | Locator configuration #3 |
|---|---|---|---|
| three line segments | three locating towers (b) | three round locating pins (a) | two round locating pins (a) and one adjustable stop (c) |
| two line segments and one external arc | two round locating pins (a) and one half-Vee (d) | three round locating pins (a) | two round locating pins (a) and one adjustable stop (c) |
| one line segment and two external arcs | one round locating pin (a) and two half-Vees (d) | three round locating pins (a) | two round locating pins (a) and one adjustable stop (c) |
| one line segment and one external arc | one round locating pin (a) and one V-pad (e) | three round locating pins (a) | two round locating pins and one adjustable stop (c) |
| three external arcs | three half-Vees (d) | three round locating pins (a) | two round locating pins and one adjustable stop (c) |
| two line segments and one small internal circle | two round locating pins (a) and one diamond hole pin (g) | | |
| two line segments and one large internal arc | three round locating pins (a) | two round locating pins (a) and one adjustable stop (c) | |
| one line segment and two small internal circles | one adjustable stop (c) and two diamond pins (g) | | |
| two small internal circles | one round hole pin (f) and one diamond pin (g) | | |
| three large internal arcs | three round locating pins (a) | two round locating pins (a) and one adjustable stop (c) | |

Note: two line segments may degenerate into one; arc and circle may mean the same
thing; and two half-Vees may be equivalent to one V-pad.
Figure 5. Examples of invalid locating designs

Figure 6. Locators to be considered: (a) round locating pin; (b) locating tower

Figure 10. Pin-hole locating with adjustable bar

Figure 14. A sketch of fixture units

Figure 11.b V-block assembly analysis

Figure 12.a Half-Vee

Figure 12.b Half-Vee assembly analysis (1)

Figure 12.c Half-Vee assembly analysis (2)

Figure 12.d Two half-Vee assembly analysis

Fixture planning flowchart (box labels in order):
1. Determine candidate locating/clamping surface set
2. Select one group of locating surfaces
3. Select fixture components: type and size
4. Generate three side locating units based on height requirement
5. Assemble side locating units onto the base plate and determine position of workpiece and locating points
6. Generate all the vertical locating units based on height requirement
7. Place these vertical locating units onto the base plate and determine position of workpiece and locating points
8. Select clamping surface group
9. Select clamp type and assemble clamp
10. Fixture planning verification
11. Fixture planning successful?

Figure 15. 3-D fixturing unit assembly
A DESIGN STRATEGY FOR PREVENTING HIGH CYCLE FATIGUE BY MINIMIZING SENSITIVITY OF
BLADED DISKS TO MISTUNING


Joseph C. Slater
Andrew J. Blair


Abstract

Bladed disk assemblies in aircraft engines are prone to high cycle fatigue as a result of localization of
vibrational energy. In order to prevent mode localization, design strategies for distributing dynamic stress over the
entire system are examined. It is shown that stress reductions of up to 75% can be obtained via minor
modifications of basic disk design.


Introduction 

The  inside  of  a  gas  turbine  is  one  of  the  harshest  environments  for  mechanical  systems  in  all  of  the  man-made 
world.  High  temperatures,  high  forces  and  large  dynamic  loads  make  reliability  design  an  exceptionally 
challenging  task.  Although  jet  engines  statistically  are  very  reliable  overall,  this  reliability  is  likely  to  decline  as 
aircraft  are  used  more  often  to  perform  missions  for  which  they  were  not  designed.  A  design  is  only  as  robust  as 
the  qualification  test  it  undergoes  for  approval.  Many  aircraft  in  the  Air  Force’s  aging  fleet  were  designed  for  quite 
different  roles  than  those  they  are  currently  being  used  for.  This  leads  to  the  likelihood  of  performance 
degradation,  reduced  reliability,  and  shorter  time  to  failure.  A  leading  cause  of  failure  in  jet  engines  is  fatigue, 
both  low  cycle  and  high  cycle.  Low  cycle  fatigue  is  a  result  of  high  loads  being  applied  to  an  object  over  a 
relatively  low  number  of  cycles.  On  the  other  hand,  high  cycle  fatigue  is  the  result  of  lower  loads  being  applied  to 
an  object  for  a  large  number  of  cycles.  This  is  commonly  the  result  of  vibration  over  an  extended  period  of  time. 

Mode  localization  in  bladed  disks  is  a  vibration  phenomenon  where  the  symmetric  mode  shapes  common  to  a 
perfectly  symmetric  (tuned),  bladed  disk  degrade  due  to  the  introduction  of  slight  variations  within  the  blades.  The 
presence  of  these  slight  variations  is  commonly  referred  to  as  mistuning.  Modal  motion  occurs  primarily  in  a  few 
of the blades on the disk (often called the “rogue” blades). Since all of the modal energy is confined to a small
number of blades, the amplitudes of the motion of these blades are greatly increased, resulting in greatly increased
stresses and reduced fatigue life.

In  order  to  gain  a  thorough  understanding  of  the  localized  behavior  of  a  mistuned  model,  it  is  first  necessary  to 
understand  the  ideal  mode  shapes  and  frequency  distributions  of  the  tuned  system.  Take  for  instance  the  following 
bladed  disk  where  the  radial  lines  represent  blades  of  the  disk. 


Figure  1:  Undeformed  Bladed  Disk 

For the sake of illustration, shortening or lengthening of the radial line denotes deflection of a blade. The outer
circle in the preceding figure represents the nominal position of the undeformed blade. In a real bladed disk, the
important deflections of the blades are usually bending, twisting, and combinations of the two, motions very
similar to those of a cantilevered beam. For this illustrative example the deflection is represented as axial
shortening or lengthening. This represents a simplification of the blade dynamics to those of a single beam-like
mode.

Since  this  is  an  eight  degree  of  freedom  model,  eight  linear  modes  can  be  expected.  For  the  perfectly  tuned 
system,  these  modes  will  be  repeated  and  symmetric  (except  for  two  of  the  modes)  as  shown  below.  The  gray  lines 
represent  the  nodal  diameters  (lines)  along  which  there  is  no  deflection. 


Figure  2:  First  Mode  (Zero  nodal  lines) 


Figure  3:  Second  and  Third  Modes  (Repeated,  with  1  nodal  diameter) 


Figure 4: Fourth and Fifth Mode (Repeated, with 2 nodal diameters)

Figure 5: Sixth and Seventh Mode (Repeated, with 3 nodal diameters)


Figure  6:  Eighth  Mode  (Not  repeated,  4  nodal  diameters) 
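
The mode pairing in Figures 2 through 6 can be reproduced with a minimal lumped model. The sketch below is an illustration only, not the finite element model used in this study: it assumes each blade is a single mass on a grounded spring, joined to its two neighbors by identical coupling springs. The stiffness and mass values are hypothetical. For such a cyclic chain the eigenvalues have a closed form, and harmonics h and N - h share the same eigenvalue, which yields the repeated modes.

```python
import math
from collections import Counter


def tuned_cyclic_modes(n_blades=8, blade_stiffness=1.0, coupling_stiffness=0.05, mass=1.0):
    """Return a sorted list of (nodal_diameters, eigenvalue) for a tuned cyclic chain.

    All parameter values are hypothetical and chosen only for illustration.
    """
    modes = []
    for harmonic in range(n_blades):
        # Closed-form eigenvalue of the circulant stiffness matrix for this harmonic.
        angle = 2.0 * math.pi * harmonic / n_blades
        eigenvalue = (blade_stiffness + 2.0 * coupling_stiffness * (1.0 - math.cos(angle))) / mass
        # Harmonics h and n_blades - h describe the same number of nodal diameters.
        nodal_diameters = min(harmonic, n_blades - harmonic)
        modes.append((nodal_diameters, eigenvalue))
    return sorted(modes)


def multiplicity_by_nodal_diameter(modes):
    """Count how many modes share each nodal diameter number."""
    return dict(Counter(nodal_diameters for nodal_diameters, _ in modes))


if __name__ == "__main__":
    for nodal_diameters, eigenvalue in tuned_cyclic_modes():
        print(nodal_diameters, round(math.sqrt(eigenvalue), 6))
    print(multiplicity_by_nodal_diameter(tuned_cyclic_modes()))
```

With eight blades, the zero and four nodal diameter modes appear once and every other nodal diameter appears twice, matching the figure captions.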

It  is  reasonable  to  expect  the  modes  to  be  repeated  in  this  fashion  when  considering  the  symmetry  involved  in 
the  structure.  The  preceding  figures  accurately  depict  the  first  of  each  type  of  modal  group  (beam  bending  and 
torsion).  It  is  not  difficult  to  extend  this  analogy  of  nodal  diameters  to  the  higher  groups,  though.  The  first  group 
can be thought of as combinations of nodal diameters with zero nodal circles. As the modal group number
increases, the number of nodal circles increases. So, the second bending mode would have one nodal circle, the
third  would  have  two  nodal  circles,  and  so  on.  In  the  present  study,  the  first  and  second  bending  modal  groups  and 
the  first  torsional  modal  group  were  investigated. 

The  natural  frequencies  of  each  modal  group  in  these  systems  are  usually  tightly  grouped  as  well,  especially  in 
the lower groups. It is not uncommon for all of the frequencies in one group to be within one percent of each other.
Typically  there  is  also  a  significant  change  in  the  mean  frequency  between  modal  groups. 

The  present  study  investigates  the  effects  of  mistuning  on  the  mode  shapes  and  frequencies  of  an  eight  bladed 
disk,  as  well  as  design  alternatives  to  minimize  these  effects.  Some  previously  used  and  accepted  standards  will  be 
used  to  qualify  the  work.  In  addition,  some  new  and  unique  measures  are  also  formulated  to  summarize  the 
findings. 

Background 

Under certain conditions, due to the high flexibility of the blades relative to the disk, bladed disks can become
very susceptible to mode localization as a result of blade mistuning (the variation of dynamic properties among the
blades attached to the disk). The development of analytical methods for predicting the natural frequencies and
mode shapes of tuned bladed disks has advanced sufficiently for use as design tools (although Swaminadham, Soni,
Stange, and Reed demonstrate that significant work still needs to be done). What is lacking is the ability to
predict the sensitivity of a design to mode localization that may result as a function of some specified mistuning,
and knowledge of how to design bladed disks that are insensitive to blade mistuning.

Griffin and Hoosac performed simulations of a simplified model of a bladed disk to generate a large number
of simulation results from which to draw statistical conclusions. The model consisted of 72 three degree of
freedom systems, representing the 72 blades of a bladed disk. The systems were coupled at the base by springs and
each was connected to ground at the end mass by a dashpot to represent system damping. The end masses and
their corresponding spring stiffnesses were varied to represent the effect of mistuning. The simulation
demonstrated that, under the worst case scenario, the amplitude of a single blade could easily double the normal
amplitude of a similarly excited tuned system. The shape of the scatter plot seems to indicate that the worst case
scenario was not achieved and that the worst case would be catastrophic. Further analysis demonstrated that the
range of blade natural frequencies, and not the shape of the distribution of the blade natural frequencies, was the
dominant influence on blade amplitudes. It was also shown that the peak blade response occurred very near to the
tuned natural frequency, validating the use of tuned analysis to design for natural frequency avoidance.

Griffin used a two degree of freedom model similar to that of Basu and Griffin but with more sophisticated
coupling to represent aerodynamic coupling of the blades. The model was capable of predicting the scatter of blade
amplitudes in some but not all cases. It was also found that systems that are sensitive to mistuning also tend to be
numerically sensitive to modeling inaccuracies, suggesting that a deterministic approach to analyzing mistuning
effects may not be possible with existing modeling techniques.

Valero  and  Bendiksen  developed  a  three  degree  of  freedom  blade  model  that  incorporated  rotation,  shroud 
slippage,  and  friction.  The  shroud  friction  model  assumes  that  the  entire  interface  slips  or  sticks  as  a  unit.  This 
approach  is  then  formulated  as  a  linearized  eigenvalue  problem.  The  conclusion  drawn  was  that  the  shroud 
interface  angle  alters  the  natural  frequencies  and  the  amount  of  friction  damping  observed.  Surprisingly,  the 
effects  of  mistuning  were  independent  of  the  shroud  interface  angles.  The  mistuning  effects  tended  to  occur  in  the 
lowest  modes  where  less  deformation  occurs  in  the  hub  (and  thus  the  coupling  between  the  blades  is  less  apparent). 
It  was  also  noted  that  the  highly  localized  modes  occurred  when  mistuning  was  highly  concentrated  in  a  few 
blades.  In  addition,  the  authors  hypothesized  that  mode  localization  can  be  minimized  by  enhancing  the  interblade 
coupling  through  shrouds. 

Two papers by Ewins represent standard reference material for understanding the modeling of bladed disks.
Ewins shows that under some mistuning conditions, blades may suffer stress levels as high as 20% greater than
in a tuned system. However, rearranging the same blades on the same hub can minimize mode localization. An
optimal arrangement of blade locations is proposed, although engineers still have great difficulty in identifying
the variations between a given set of blades for this purpose. Ewins also shows that many more resonant frequencies
exist for mistuned bladed disk assemblies due to the splitting of repeated modes into independent modes. The 20%
stress increase is in contrast to the results of Dye and Henry and of Whitehead. Dye and Henry show that
almost a 3-fold increase in response amplitude can occur in the presence of mistuning, while Whitehead
analytically shows a theoretical increase of $0.5\left(1 + \sqrt{n/2}\right)$, where n is the number of blades.

Ewins and Han used a two degree of freedom blade model to study the response of a 33-bladed disk. They
show  that,  for  this  specific  case,  blade  mistuning  always  results  in  an  increase  in  blade  amplitude  and  that  the 
blade  with  the  greatest  mistune  always  suffers  the  greatest  motion. 

Yang and Griffin's substructuring technique used the clamped-free modes of the blades to generate a reduced
model of mistuned bladed disk assemblies. They showed that for a simple bladed disk assembly, the reduced model
natural frequencies match the natural frequencies of the original finite element model almost perfectly, and the
peak forced response occurs at a frequency approximately 1% higher than the “true” frequency. For the mistuned
case, the full and reduced models agree well. The exception was in a case where the tuned disk exhibits frequency
veering, exactly the case in which mode localization occurs.

Irretier applied a modified component mode synthesis to reduce a complete bladed disk finite element model
to  a  smaller,  more  tractable  problem.  He  showed  that  the  manner  of  frequency  shifting  due  to  mistuning,  and  the 
corresponding  change  in  mode  shapes,  were  strongly  dependent  on  the  type  of  mistuning. 

Kaza and Kielb, and Kielb and Kaza, used aerodynamically coupled single degree of freedom blade models
for their bladed disk vibration analysis. They suggested that the effects of mistuning can be beneficial or adverse
depending on the engine order of the forcing function. A significant result was that it may be possible to use
designed mistuning to raise the blade flutter speed without seriously degrading the forced response, although the
benefits of mistuning level off at about 5% mistune. Bendiksen showed a similar result. Damping was shown to
be much more effective when the blades are well tuned, which may cause problems when significant damping
exists in the tuned system. A more sophisticated model showed many of the same results (Kaza and Kielb).

Muszynska and Jones developed a five degree of freedom blade model incorporating Coulomb shroud friction,
Coulomb blade to hub friction, and structural damping. Their model showed that mistuning increases the response
amplitude and that appropriate design of friction dampers can reduce the response by as much as an order of
magnitude as compared to a non-optimally designed friction damper. They also reported that the optimal damper
design is optimal for both the tuned and mistuned cases, although the amplitudes for the mistuned
cases were still higher than the amplitude for the tuned case. An unexpected effect was that the friction damping,
due to its nonlinear nature, causes nonlinear coupling, inducing mode localization to some degree. Vakakis has
shown that when purely nonlinear coupling exists, mode localization can occur in the absence of mistuning.

Petrov combined finite element, substructuring, transfer matrix, and dynamic compliance methods to develop
a complex bladed disk model including shroud, joint, material damping, aerodynamic, and cable effects (for steam
turbines). Isoparametric elements were used in the joint sections to model the complex geometries. A
condensation technique was applied to reduce the size of the matrices. The code shows that slight mistuning
drastically alters the clean transfer functions obtained for a tuned system, creating numerous resonances where
only a handful previously existed.

Wei  and  Pierre  demonstrated  that  the  sensitivity  of  a  bladed  disk  to  mode  localization  as  a  result  of  mistuning 
is  directly  related  to  the  ratio  of  the  mistuning  strength  to  the  coupling  strength  using  a  single  degree  of  freedom 
blade  model.  They  showed  that  the  effects  of  mistuning  are  minimal  when  coupling  is  great.  When  coupling  is 
weak,  however,  the  bladed  disk  is  very  sensitive  to  mistuning.  Thus,  a  bladed  disk  assembly  that  shows  a  great 
deal  of  motion  of  the  hub  when  moving  in  a  mode  will  be  less  susceptible  to  the  effects  of  mistuning.  Since  more 
relative  motion  occurs  in  the  hub  in  higher  modes,  it  seems  likely  that  higher  modes  should  be  less  susceptible  to 
blade  mistuning. 
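
The dependence on the ratio of mistuning strength to coupling strength can be illustrated with the smallest possible example. The sketch below is a toy model and not the model of Wei and Pierre: it assumes only two unit-mass oscillators with grounded unit-stiffness springs, one of which is stiffened by a hypothetical `mistuning` amount, joined by a spring of hypothetical stiffness `coupling`. It reports the fraction of a mode's squared amplitude carried by the more active oscillator, where 0.5 means the motion is shared equally and 1.0 means it is fully localized.

```python
import math


def dominant_share(mistuning, coupling):
    """Fraction of squared modal amplitude on the more active of two coupled oscillators.

    The stiffness matrix is [[1 + mistuning + coupling, -coupling],
                             [-coupling, 1 + coupling]] with unit masses.
    Both modes of a symmetric 2x2 system give the same value.
    """
    a = 1.0 + mistuning + coupling
    d = 1.0 + coupling
    b = -coupling
    # Rotation angle that diagonalizes a symmetric 2x2 matrix: tan(2*theta) = 2b / (a - d).
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    share = math.cos(theta) ** 2
    return max(share, 1.0 - share)


if __name__ == "__main__":
    for coupling in (1.0, 0.1, 0.01, 0.001, 0.0001):
        print(coupling, round(dominant_share(0.01, coupling), 4))
```

For a fixed one percent stiffness mistuning, the share stays near 0.5 when coupling is strong and approaches 1.0 as coupling becomes weak relative to the mistuning.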

Pierre and Murthy and Pierre, Smith, and Murthy included aerodynamic coupling of the blades in their
perturbation approach to determination of the effects of blade mistuning. Since it was shown that the low coupling
between the blades is the cause of the propensity for mode localization, Pierre and Murthy applied the approach of
Wei and Pierre, in which the coupling is treated as the perturbation of n originally independent blades with slightly
different modal parameters. This is in contrast to procedures that include the mistuning as the perturbation of an
originally tuned system. A heuristic explanation for why this may work is that when mode localization occurs, the
blades act almost as independent structures. Since the nominal structure used by Wei and Pierre was the set of
independent blades, it is reasonable to expect that this should yield better results when localization occurs. Pierre
and Murthy also reported that blades similar in frequency tend to vibrate together in a localized mode, even when
the blades between them do not show significant motion.

A  summary  of  these  results  leads  to  the  following  conclusions: 

1) Detailed finite element analysis of complex, tuned bladed assemblies is prone to large errors when mode
localization occurs due to the intrinsic numerical ill-conditioning. However, the degree of ill-conditioning is an
indicator of the sensitivity of the design to mode localization.

2) Detailed finite element analysis (FEA) of complex, mistuned bladed assemblies can be extremely costly due
to the inability to apply perfect symmetry relations. Accurate model reduction techniques are necessary.

3) Even if detailed FEA of complex, mistuned bladed disks could yield valid results, usefulness for design is
questionable.

4) The most promising method of gaining a detailed finite element model that is capable of incorporating the
detailed effects of blade mistuning is to apply component mode synthesis in one of its various forms
(references 15, 18, 24, 25, 32, 33, and 40, among others), as performed by Irretier (reference 19).

5) Mistuning has the greatest effect when coupling between the blades is weakest. Added coupling through
shrouds is likely to reduce mode localization, minimizing the effects of blade mistuning.

6)  In  addition,  since  higher  sets  of  bladed  disk  modes  tend  to  have  greater  coupling  between  the  blades,  mode 
localization may be a phenomenon of interest only with respect to the lower frequency sets of modes.

7) Dry friction can cause mode localization to occur in the absence of blade mistuning. Thus, in any assembly
where  friction  dampers  are  used,  the  “effects”  of  blade  mistuning  exist  regardless  of  how  well  the  blade 
frequencies  are  tuned. 

8)  Blade  mistuning  is  not  guaranteed  to  cause  significant  mode  localization.  The  same  set  of  mistuned  blades 
will  exhibit  symmetric  modes  or  localized  modes  depending  on  the  arrangement  of  the  blades  on  the  hub. 
Little is understood about why this is the case, or how to order the blades to minimize this effect.

9)  It  is  likely  that  blade  mistuning  can  cause  responses  well  above  those  reported  in  most  studies.  Monte  Carlo 
simulations  show  distributions  to  have  very  sharp  amplitude  peaks  indicating  extremely  large  worst-case 
scenarios  as  opposed  to  soft  peaks  which  would  indicate  milder  worst-case  scenarios. 

Problem 

In  bladed  disk  assemblies,  the  disk  acts  as  a  coupling  device  between  the  blades.  As  the  stiffness  of  the  disk 
increases, blade coupling decreases. It has been shown that weak interblade coupling leads to high levels of mode
localization when blades are mistuned. Mode localization also occurs in bladed disks as a result of their
symmetry.  Sets  of  axisymmetric  modes  combine  to  form  a  basis  set  from  which  drastically  localized  mode  shapes 
can  be  generated.  Bladed  disk  assemblies  are  traditionally  designed  to  be  symmetric  for  balancing.  Two 
hypotheses  are  investigated  in  this  work:  1)  Decreasing  the  stiffness  of  the  disk  by  varying  the  geometry  and/or 
material  composition  will  reduce  mode  localization  due  to  mistuning,  2)  Destroying  the  symmetry  of  the  disk,  yet 
maintaining  balance,  will  reduce  mode  localization  due  to  mistuning.  Each  of  these  was  investigated  separately,  as 
well  as  combinations  of  the  two. 

A model of an eight bladed disk based on an experimental testbed in existence at Wright State University was
constructed in ANSYS® using eight noded brick elements (Figures 10 and 11, p. 19). The model was designed to
exhibit a propensity for mode localization similar to that in real bladed disk assemblies. The bladed disk model
was adjusted to provide weak coupling between the blades, resulting in tightly packed sets of natural frequencies,
eight modes in each. The blade deformation in the first set of modes is predominantly a first beam bending mode,
in the second set of modes it is predominantly a first beam torsional mode, and in the third set of modes it is
predominantly a second beam bending mode.

Frequency (Hz)   Deformation shape    Nodal lines, Nodal circles
336.8            1st Beam bending     0, 0
336.8            1st Beam bending     1, 0
336.8            1st Beam bending     1, 0
336.9            1st Beam bending     2, 0
336.9            1st Beam bending     2, 0
337.07           1st Beam bending     3, 0
337.07           1st Beam bending     3, 0
337.15           1st Beam bending     4, 0
1411.4           1st Beam torsion     0, 0
1411.4           1st Beam torsion     1, 0
1411.4           1st Beam torsion     1, 0
1411.5           1st Beam torsion     2, 0
1411.5           1st Beam torsion     2, 0
1411.8           1st Beam torsion     3, 0
1411.8           1st Beam torsion     3, 0
1412.0           1st Beam torsion     4, 0
2066.1           2nd Beam bending     0, 1
2066.7           2nd Beam bending     1, 1
2066.7           2nd Beam bending     1, 1
2072.8           2nd Beam bending     2, 1
2072.8           2nd Beam bending     2, 1
2081.4           2nd Beam bending     3, 1
2081.4           2nd Beam bending     3, 1
2084.6           2nd Beam bending     4, 1

Table 1: Modal characteristics of the model investigated.
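
The tight grouping of natural frequencies can be checked directly from the values in Table 1. The short sketch below copies the tabulated frequencies and computes, for each modal group, the spread between the lowest and highest frequency as a percentage of the lowest. The function names and the choice of the lowest frequency as the reference are assumptions made for illustration.

```python
# Natural frequencies in Hz copied from Table 1, grouped by blade deformation shape.
MODAL_GROUPS = {
    "first bending": [336.8, 336.8, 336.8, 336.9, 336.9, 337.07, 337.07, 337.15],
    "first torsion": [1411.4, 1411.4, 1411.4, 1411.5, 1411.5, 1411.8, 1411.8, 1412.0],
    "second bending": [2066.1, 2066.7, 2066.7, 2072.8, 2072.8, 2081.4, 2081.4, 2084.6],
}


def percent_spread(frequencies):
    """Spread of one modal group, (max - min) / min, expressed in percent."""
    return 100.0 * (max(frequencies) - min(frequencies)) / min(frequencies)


def mean_frequency(frequencies):
    """Arithmetic mean of the frequencies in one modal group."""
    return sum(frequencies) / len(frequencies)


SPREADS = {name: percent_spread(values) for name, values in MODAL_GROUPS.items()}
MEANS = {name: mean_frequency(values) for name, values in MODAL_GROUPS.items()}

if __name__ == "__main__":
    for name in MODAL_GROUPS:
        print(f"{name}: mean {MEANS[name]:.1f} Hz, spread {SPREADS[name]:.3f} %")
```

Each group spans less than one percent of its lowest frequency, while the mean frequency changes by several hundred hertz from one group to the next.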


In  all,  seven  different  models  were  developed  for  this  study.  The  aforementioned  baseline  model  was 
symmetric  with  an  axisymmetric  disk  stiffness,  typical  of  traditional  bladed  disk  design.  In  the  second  model,  the 
symmetry  of  the  disk  was  destroyed,  yet  balance  was  maintained.  The  bending  stiffness  of  the  interior  portion  of 
the  disk  was  reduced  in  the  third  model.  In  the  fourth  model,  the  stiffness  of  the  exterior  of  the  disk  was  reduced 
while  the  interior  portion  of  the  disk  remained  unchanged.  The  fifth  model  was  a  combination  of  the  second  and 
third,  where  the  disk  was  non-symmetric  and  the  stiffness  of  the  interior  portion  of  the  disk  was  reduced.  The 
sixth  model  was  a  combination  of  the  second  and  the  fourth,  where  the  disk  was  non-symmetric  and  the  stiffness  of 
the exterior portion of the disk was reduced. The seventh model was another variation of the non-symmetric disk.
In this model, the disk was divided into eight equal sections. Each section was assigned a different stiffness
(Young's Modulus), such that the disk would remain balanced.

Three types of mistuning were chosen for the investigation. The first two were: one percent of the mass of one
blade added to the tip of one blade, and one percent blade mass removed from the tip of a single blade. This results
in two cases of mistuning for the symmetric models. For non-symmetric models, however, each type (addition and
subtraction) must be investigated on each of four different blades. This results in eight different cases of mistuning
for the non-symmetric models. The preceding cases, representing minor damage to an individual blade,
are in agreement with accepted practice in prior studies. The final type of mistuning is random
mistuning. Three random patterns were chosen such that the mass added to the tip of each blade could vary
by up to plus or minus one percent of the mass of one blade.

Methodology-  Models 

All models used in this study were variations of the symmetric, constant stiffness system. This standard system
was constructed entirely of eight noded brick elements (ANSYS® element Solid 45), having three degrees of
freedom per node. This element was chosen over a plate element because of the overlapping condition of the disk,
blades, and plates (Figure 7). The material properties used were those of mild steel: Young's Modulus (E) = 200
GPa, density (ρ) = 7800 kg/m^3, and Poisson's Ratio (ν) = 0.3. The geometry for this model was created in
AutoCAD 12®. Lines were also included in the geometry that allowed meshing to be performed in ANSYS®. A
proper mesh in this study had two important properties. First, the mesh had to be symmetric. Due to numerical
sensitivities, a non-symmetrical mesh could numerically induce undesirable mode localization characteristics that
would not appear in a symmetrical, tuned disk. Second, the mesh had to consist entirely of “brick” elements,
each element having eight independent nodes. These elements have a tendency to behave too “stiffly” when allowed
to degenerate to “wedges”, elements in which not all eight nodes are independent.


Figure 7: Mounting condition of each blade to the disk (labels: symmetry line, blade, disk).

The second model generated had a non-symmetric mass distribution and axisymmetric bending stiffness. A
point mass element (ANSYS® element Mass 21) was added to the edge of the disk at the centerline of each
blade. The masses were added in a pattern that repeated after 180° in order to maintain the balance of the system
(Figure 8). The procedure added ten percent of the mass of the disk to the system. This is an unacceptable
increase in system mass. A more realistic implementation of this concept would decrease mass at some locations
and increase mass in others, resulting in a smaller net change in mass.

1 = 0.5 % disk mass = 0.02825 kg
2 = 1.0 % disk mass = 0.0565 kg
3 = 1.5 % disk mass = 0.08475 kg
4 = 2.0 % disk mass = 0.113 kg

Figure 8: Mass distribution for the mass-added, non-symmetric model.
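
The two statements made about this mass pattern, that it adds ten percent of the disk mass and that it keeps the disk balanced, can be checked with a few lines of arithmetic. The sketch below assumes the four legend values of Figure 8 are placed at eight equally spaced rim positions in the order 1, 2, 3, 4, 1, 2, 3, 4, which is one arrangement that repeats after 180°. The actual ordering in the figure is not recoverable from the text. The disk mass of 5.65 kg is inferred from the legend (0.5 % of disk mass = 0.02825 kg), and the unit radius is arbitrary.

```python
import math

DISK_MASS_KG = 5.65  # inferred from the legend: 0.5 % of disk mass = 0.02825 kg
LEGEND_PERCENT = {1: 0.5, 2: 1.0, 3: 1.5, 4: 2.0}
# Assumed ordering of legend labels around the rim; the pattern repeats after 180 degrees.
LABELS_AROUND_RIM = [1, 2, 3, 4, 1, 2, 3, 4]


def added_masses():
    """Point mass in kg at each of the eight rim positions."""
    return [DISK_MASS_KG * LEGEND_PERCENT[label] / 100.0 for label in LABELS_AROUND_RIM]


def first_moment(masses, radius=1.0):
    """First mass moment (x, y) about the disk center; zero means static balance."""
    count = len(masses)
    moment_x = sum(m * radius * math.cos(2.0 * math.pi * k / count) for k, m in enumerate(masses))
    moment_y = sum(m * radius * math.sin(2.0 * math.pi * k / count) for k, m in enumerate(masses))
    return moment_x, moment_y


if __name__ == "__main__":
    masses = added_masses()
    print("added fraction of disk mass:", sum(masses) / DISK_MASS_KG)
    print("first moment about center:", first_moment(masses))
```

Because every mass has an equal partner diametrically opposite, the first moment cancels regardless of which legend value sits at which position within the half pattern.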


The  third  and  fourth  models  were  symmetric,  with  reduced  stiffness  at  the  interior  and  exterior  portions  of  the 
disk, respectively. For simplicity of modeling, reducing the stiffness was accomplished by reducing Young's
Modulus  (E)  to  a  value  equal  to  half  of  its  original  value  (100  GPa).  In  practice,  similar  reductions  in  stiffness 
would  be  accomplished  by  varying  both  design  materials  and  geometry. 

In conjunction with the second model, the third and fourth models were used to create models five and six.
This gave two non-symmetric disks, one with reduced stiffness at the interior of the disk and one with reduced
stiffness at the exterior of the disk.


1 = 100 % Young's Modulus = 200 GPa
2 = 80 % Young's Modulus = 160 GPa
3 = 60 % Young's Modulus = 120 GPa
4 = 40 % Young's Modulus = 80 GPa

Figure 9: Stiffness distribution for the variable stiffness, non-symmetric model.


When  considering  the  second  model,  the  investigators  found  it  to  be  of  considerable  concern  that,  in  practice, 
destroying  the  symmetry  of  the  disk  would  inevitably  lead  to  changes  in  the  geometry.  Not  only  would  such  a 
change  mean  the  addition  or  subtraction  of  mass  in  certain  locales,  but  it  would  also  mean  a  considerable  change  in 
stiffness.  It  was  this  concern  that  brought  about  the  seventh  and  final  model.  In  this  model,  the  disk  was  divided 
into eight equal sections. Four values of Young's Modulus were used in these sections in a repeated pattern similar
to the mass distribution in the second model (Figure 9).

Methodology- Mistuning

Three basic types of mistuning were chosen to best represent realistic challenges to the robustness of a bladed
disk design. The first two were the addition of mass to, and the removal of mass from, a single blade. These were
chosen to represent obstructions that may be “sucked into” a bladed disk system and either adhere to a blade or
chip the blade in passing through the system. The final type of mistuning chosen was the random addition and
subtraction of mass from each blade, representative of the small variance between blades due to manufacturing
techniques and tolerances.

For the tuned systems, one percent of a single blade mass was added to the tip of each blade. To simulate the
addition of one percent mass to one blade, the mass of the point element added to that blade was simply increased
from one percent blade mass to two percent blade mass. To simulate the removal of mass from one blade, the one
percent blade mass element was simply not added to that particular blade. For the symmetric models it was only
necessary to investigate the results of adding mass to or removing mass from a single location.
However, for the non-symmetric models it was necessary to investigate the results of adding
mass to or removing mass from four different locations (locations 1-4, Figures 8 and 9). Depending upon which
location the mass was added to or removed from, the resulting mode shapes and frequency distributions of the
system would be different.

Three  random  patterns  of  mistuning  were  also  investigated.  As  mentioned  before,  the  addition  of  one  percent 
blade  mass  to  each  blade  was  chosen  as  the  tuned,  or  nominal  case.  In  determining  the  random  patterns  to  be  used, 
the  mass  to  be  added  to  any  location  was  allowed  to  vary  from  zero  to  two  percent  of  a  single  blade  mass.  This 
represents  a  tolerance  of  plus  or  minus  one  percent,  with  one  percent  of  the  mass  of  one  blade  being  the  nominal 
value. The random point mass distributions were generated in MATLAB® using the randn function to
generate  three  mistuning  patterns,  each  with  a  normal  distribution  about  the  nominal  value.  The  patterns 
generated  follow  in  Table  2.  This  type  of  statistical  representation  (normal  distribution  about  a  nominal  value)  is 
widely  used  and  accepted  when  incorporating  manufacturing  tolerances  into  design  analysis. 
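
The sampling described above can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of the original MATLAB® script: the nominal tip mass, the standard deviation, and the random seed are hypothetical, and samples falling outside the zero to two percent band are simply redrawn, since the text does not say how the bounds were enforced.

```python
import random

# Hypothetical nominal tip mass in kg (one percent of an assumed blade mass).
NOMINAL_TIP_MASS_KG = 0.23e-3


def random_mistuning_pattern(nominal, n_blades=8, sigma_fraction=0.5, rng=None):
    """Draw one tip mass per blade from a normal distribution about the nominal value.

    sigma_fraction (standard deviation as a fraction of nominal) is an assumption.
    Samples outside [0, 2 * nominal] are redrawn to respect the stated tolerance band.
    """
    rng = rng or random.Random()
    lower, upper = 0.0, 2.0 * nominal
    pattern = []
    for _ in range(n_blades):
        while True:
            sample = rng.gauss(nominal, sigma_fraction * nominal)
            if lower <= sample <= upper:
                break
        pattern.append(sample)
    return pattern


def three_patterns(seed=1996):
    """Generate three mistuning patterns from one seeded generator."""
    rng = random.Random(seed)
    return [random_mistuning_pattern(NOMINAL_TIP_MASS_KG, rng=rng) for _ in range(3)]


if __name__ == "__main__":
    for index, pattern in enumerate(three_patterns(), start=1):
        print(f"Random {index}:", [round(1e3 * mass, 4) for mass in pattern], "(E-3 kg)")
```

Seeding the generator makes the three patterns repeatable, so every model can be analyzed with exactly the same mistuning.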


Blade Location   Random 1    Random 2    Random 3
#                (E-3 kg)    (E-3 kg)    (E-3 kg)
1                0.0168      0.3343      0.3413
2                0.0260      0.2867      0.4431
3                0.2579      0.4529      0.3710
4                0.3267      0.4119      0.1278
5                0.0037      0.2565      0.0231
6                0.1866      0.0448      0.3583
7                0.0325      0.3183      0.1598
8                0.2032      0.2025      0.3080

Table 2: Random mistuning mass distributions.

Methodology- Analysis


Modal analysis was performed on all seven models for the tuned case and for each case of mistuning in
ANSYS®, version 5.2. In each case, the inner surface of the disk was constrained to have zero displacement in all
degrees of freedom. To minimize computer time, a lumped mass matrix formulation was used. The subspace
iteration method was used for eigenvalue extraction, with the tolerance for convergence checking set to 1E-5. The
first twenty-four modes were extracted for all cases.


Methodology-  Measures 


All  cases  of  mistuning  were  investigated  by  application  to  each  of  the  seven  models.  In  each  case  the  localized 
modes  were  compared  to  the  nominal  system.  This  resulted  in  the  analysis  of  seven  tuned  systems  and  59 
mistuned  systems.  The  results  of  the  study  were  quantified  using  some  standard  measures.  In  addition,  some  new 
and  unique  measures  were  also  developed  to  compare  the  tuned  and  mistuned  cases. 
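
The count of 59 mistuned systems follows from the case definitions given earlier, and can be tallied as below. The model numbering and symmetry classification are taken from the model descriptions in this report. The variable and function names are illustrative.

```python
# Models 1, 3, and 4 are symmetric; models 2, 5, 6, and 7 are non-symmetric.
SYMMETRIC_MODELS = (1, 3, 4)
NON_SYMMETRIC_MODELS = (2, 5, 6, 7)

SINGLE_BLADE_TYPES = 2  # mass added to one blade, mass removed from one blade
RANDOM_PATTERNS = 3     # three random mistuning patterns


def mistuned_cases(model):
    """Number of mistuned analyses required for one model."""
    # Symmetric models need one blade location; non-symmetric models need four.
    locations = 1 if model in SYMMETRIC_MODELS else 4
    return SINGLE_BLADE_TYPES * locations + RANDOM_PATTERNS


ALL_MODELS = SYMMETRIC_MODELS + NON_SYMMETRIC_MODELS
TOTAL_TUNED = len(ALL_MODELS)
TOTAL_MISTUNED = sum(mistuned_cases(model) for model in ALL_MODELS)

if __name__ == "__main__":
    print("tuned systems:", TOTAL_TUNED)
    print("mistuned systems:", TOTAL_MISTUNED)
```

Each symmetric model contributes 5 cases and each non-symmetric model contributes 11, giving 3 x 5 + 4 x 11 = 59.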

Most studies concentrate on the first group of modes, corresponding to the first beam bending mode of a single
blade. This is done for two reasons: first, and foremost, is the relative simplicity in mathematical formulation
when assuming one general shape of deformation; second, is that mode localization tends to occur to a greater
degree in the lower modes. When looking at only the first beam bending modes, it is sufficient to quantify the
results by looking at the displacement at the tips of the blades. In this particular mode of deformation there is a
direct correlation between the tip displacement of a blade and the amount of stress at its root. This is not true for
higher modes. To properly investigate the higher modes, some measure of energy or stress was needed.

ANSYS® 5.2 has the built-in capability to determine relative Principal and Von Mises Stresses at each node. It
is important to notice that these stresses are qualified as relative because there is no forcing of the system in modal
analysis. This capability was utilized in determining a meaningful measure for presenting the results of this study.
It is intuitive that the blade with the highest value of Von Mises Stress (VM) will be the one where the greatest
mode localization takes place. To obtain a true measure of mode localization, this maximum value of stress was
compared to the average value of stress for the entire system, yielding a stress ratio, R. Here, R_t represents the
stress ratio for the tuned case and R_m represents the stress ratio for the mistuned cases.


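$$R_t = \frac{VM_{\max,\,\text{tuned}}}{VM_{\text{avg},\,\text{tuned}}} \qquad \text{(a)}$$

$$R_m = \frac{VM_{\max,\,\text{mistuned}}}{VM_{\text{avg},\,\text{mistuned}}} \qquad \text{(b)}$$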

Finally, and most importantly, the mistuned response (R_m) of each model was compared to that of the other
models for each type of mistuning. This is done graphically and can be found in Figures 13-24 (pp. 20-31). It is
from these graphs that the investigators draw their conclusions.

Some  mathematical  manipulation  was  required  to  justify  the  comparison  of  results  between  different  models. 
ANSYS  normalizes  the  resulting  eigenvectors  (mode  shape  vectors)  by  the  mass  matrix  in  the  following  manner: 
\[ \{\phi\}_i^T \, [M] \, \{\phi\}_i = 1 \]

where $\{\phi\}_i$ is the ith mode shape eigenvector, and [M] is the system mass matrix. The relative stresses are
then calculated using these normalized eigenvectors. This creates a problem in comparing eigenvectors or stress
vectors of different models, since for different models, these vectors are normalized to different mass matrices. To
remedy this situation, both the eigenvectors and the stress vectors were normalized to unit length.
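The effect of this renormalization can be illustrated with a small Python sketch. The helper names and the diagonal (lumped) mass matrices are assumptions made for the example only; the finite element mass matrices used in the study are not reproduced here.

```python
import math


def mass_normalize(phi, mass_diag):
    """Scale phi so that phi^T M phi = 1, assuming a diagonal mass matrix."""
    modal_mass = sum(m * x * x for m, x in zip(mass_diag, phi))
    scale = 1.0 / math.sqrt(modal_mass)
    return [x * scale for x in phi]


def unit_normalize(vec):
    """Scale vec to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


shape = [3.0, 4.0]                            # same mode shape in both models
model_a = mass_normalize(shape, [1.0, 1.0])   # hypothetical lumped masses
model_b = mass_normalize(shape, [4.0, 4.0])   # a heavier hypothetical model

print(model_a, model_b)                                  # different scales
print(unit_normalize(model_a), unit_normalize(model_b))  # same direction
```

After mass normalization the two vectors differ by a scale factor that depends on each model's mass matrix; after unit-length normalization they coincide, which is what makes a comparison between models meaningful.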

When mistuning occurs in a given system, it is important to realize that groups of modes remain grouped. In
fact, the mistuned mode shapes are shown to be linear combinations of the tuned mode shapes. This is verified by
the following calculation.

\[ \underbrace{[\Phi_t]^T}_{24 \times n} \; \underbrace{[\Phi_m]}_{n \times 24} = \underbrace{[\Psi]}_{24 \times 24} \]


where $[\Phi_t]$ and $[\Phi_m]$ are the matrix of tuned eigenvectors (mode shapes) and the matrix of mistuned eigenvectors
(mode shapes), respectively. The variable $[\Psi]$ is defined as the transformation matrix and n is the number of
degrees of freedom. This results in an m x m matrix, where m is the number of modes examined (m=24, in this
case). The transformation matrix can then be divided into three submatrices, corresponding to the three modal
spaces (modal groups) being investigated.


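\[ \underset{24 \times 24}{[\Psi]} = \begin{bmatrix} \Psi_1 & & \\ & \Psi_2 & \\ & & \Psi_3 \end{bmatrix}, \qquad \text{each } \Psi_k \text{ of size } 8 \times 8 \]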


The determinant of each of the three submatrices should have a magnitude of one, if the mistuned mode shapes
are indeed linear combinations of the corresponding tuned mode shapes for each modal group. Each submatrix
can also be examined for the contribution of each of the tuned mode shapes to each of the mistuned mode shapes.
Here, each column represents a mistuned mode shape, and each row represents a tuned mode shape. In the present
example ($\Psi_1$, model 1, mistuning case 1), it is easily seen that the second mistuned mode shape is comprised
mainly of the first and third tuned mode shapes. The seventh mistuned mode shape is comprised almost entirely of
the sixth tuned mode shape, and so on.


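\[
\Psi_1 =
\begin{bmatrix}
 0.315 &  0.734 &  0.451 &  0.360 & \sim 0 &  0.159 &  0.005 & -0.040 \\
 0.167 &  0.371 & -0.892 &  0.173 & \sim 0 &  0.089 & -0.003 & -0.025 \\
 0.510 & -0.570 & \sim 0 &  0.382 & \sim 0 &  0.268 & \sim 0 & -0.070 \\
 0.305 & \sim 0 &  0.015 & -0.689 & \sim 0 &  0.509 & -0.007 & -0.103 \\
\sim 0 & \sim 0 & \sim 0 & \sim 0 & -1     & \sim 0 & \sim 0 & \sim 0 \\
-0.007 &  0.003 &  0.005 &  0.007 & \sim 0 & \sim 0 & -1     & -0.010 \\
 0.490 & -0.003 & \sim 0 & -0.135 & \sim 0 & -0.754 & \sim 0 & -0.417 \\
 0.343 & -0.003 & -0.003 & -0.075 & \sim 0 & -0.261 & -0.012 &  0.900
\end{bmatrix}
\]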
Along with the resulting change in mode shapes, it is equally important to examine the resulting changes in the
natural frequencies of the mistuned system relative to those of the tuned system. In the tuned system, especially
systems in which the disk is symmetric, all natural frequencies of each group are within a very small percentage of
each other. In the mistuned models, some of the natural frequencies in each group can move considerably away
from the mean frequency of that group (this phenomenon is commonly known as eigenvalue veering).


Mode   Tuned     Cases 1-4  Cases 5-8  Case 9    Case 10   Case 11
 #     Hz        Hz         Hz         Hz        Hz        Hz
 1     336.8     330.64     336.8      334.74    331.49    331.74
 2     336.8     336.8      336.8      336.56    332.55    333.6
 3     336.8     336.8      336.85     338.01    334.54    333.92
 4     336.9     336.85     336.9      338.45    334.96    334.36
 5     336.9     336.9      337        342.68    335.8     335.24
 6     337.07    337        337.07     342.83    336.59    339.18
 7     337.07    337.07     337.13     343.15    338.03    340.05
 8     337.15    337.13     343.59     343.49    342.34    342.94
 9     1411.4    1411.3     1411.4     1411.4    1411.3    1411.3
10     1411.4    1411.4     1411.4     1411.4    1411.3    1411.4
11     1411.4    1411.4     1411.4     1411.4    1411.4    1411.4
12     1411.5    1411.5     1411.5     1411.5    1411.5    1411.5
13     1411.5    1411.5     1411.5     1411.6    1411.5    1411.5
14     1411.8    1411.8     1411.8     1411.9    1411.8    1411.8
15     1411.8    1411.8     1411.9     1411.9    1411.8    1411.9
16     1412      1412       1412       1412      1412      1412
17     2066.1    2038.4     2066.5     2059.6    2040.9    2042.6
18     2066.6    2066.5     2066.8     2073.1    2051      2055.1
19     2066.7    2066.8     2069.5     2078.6    2057.8    2056.1
20     2072.6    2070.3     2072.8     2081      2061.3    2059
21     2072.8    2072.8     2076.9     2104.3    2070.6    2067.2
22     2081.4    2078.5     2081.6     2108.9    2072.5    2088.6
23     2081.6    2081.6     2083.7     2113.9    2082.8    2091.3
24     2084.6    2083.9     2114.1     2115.4    2106.7    2110.9

Table 3: Frequency table for Model 1 (symmetric mass distribution, axisymmetric stiffness).


In  the  case  of  the  tuned  symmetric  disk,  the  frequencies  in  each  group  are  all  within  one  percent  of  each  other. 
This  is  no  longer  the  case  when  the  system  is  mistuned.  In  mistuning  cases  1-4  (addition  of  mass  to  one  blade),  the 
first  frequency  of  the  first  mistuned  group  deviates  considerably  from  the  rest  of  its  group.  In  cases  5-8  (removal  of 
mass  from  one  blade),  it  is  the  last  frequency  of  that  group  that  deviates.  There  is  no  single  frequency  “leaving” 
the group in the random mistuning cases (9-11). Instead, the frequencies of that group are dispersed over a larger
range. It should be noted here that, in the case of mass addition or removal, the mode whose frequency deviates
most from the group is also the mode which exhibits the strongest mode localization. A complete set of frequency
tables  for  each  model  can  be  found  in  Tables  4-10  (pp.  32-37). 
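The observations above can be checked directly against the first modal group of Table 3. The sketch below is illustrative only: the frequencies are copied from Table 3, while the two helper functions and their names are assumptions introduced for the example.

```python
# First modal group (modes 1-8) of Model 1, in Hz, copied from Table 3.
TUNED = [336.8, 336.8, 336.8, 336.9, 336.9, 337.07, 337.07, 337.15]
MASS_ADDED = [330.64, 336.8, 336.8, 336.85, 336.9, 337.0, 337.07, 337.13]    # cases 1-4
MASS_REMOVED = [336.8, 336.8, 336.85, 336.9, 337.0, 337.07, 337.13, 343.59]  # cases 5-8


def percent_spread(freqs):
    """Range of the group as a percentage of the group mean."""
    mean = sum(freqs) / len(freqs)
    return 100.0 * (max(freqs) - min(freqs)) / mean


def most_deviant_mode(freqs):
    """1-based position of the frequency farthest from the group mean."""
    mean = sum(freqs) / len(freqs)
    index = max(range(len(freqs)), key=lambda i: abs(freqs[i] - mean))
    return index + 1


for label, freqs in (("tuned", TUNED),
                     ("mass added", MASS_ADDED),
                     ("mass removed", MASS_REMOVED)):
    print(f"{label:13s} spread = {percent_spread(freqs):.2f}%  "
          f"most deviant mode = {most_deviant_mode(freqs)}")
```

With these numbers the tuned group spans roughly 0.1% of its mean, while each blade-damage group spans about 2%. The frequency that leaves the group is the first mode when mass is added and the eighth mode when mass is removed, in agreement with the text.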


Results 

It  was  the  goal  of  the  investigators  to  minimize  the  detrimental  effects  of  mode  localization  by  altering  the 
symmetry and/or stiffness of the disk. Individual blade damage (addition and subtraction of mass) was
investigated, as well as random mistuning representative of small manufacturing variances. The results of this
study are presented in Tables 4-10 (pp. 32-37) and, most significantly, Figures 12-23 (pp. 20-31). The results of
this study may prove to be very useful in the future design of bladed disk assemblies, especially when the higher
modes of vibration are of particular concern.

In  the  examination  of  the  stress  ratio,  it  is  important  to  first  look  at  the  tuned  models.  In  the  case  of  the 
symmetric  disk  with  axisymmetric  stiffness  (the  baseline  model  to  which  all  design  modifications  are  compared), 
stress  ratio  values  (Rt)  range  between:  9-11  for  the  first  modal  group,  3-7  for  the  second  modal  group,  and  5-8  for 
the  third  modal  group.  The  proposed  design  changes  offered  little  improvement  in  the  stress  ratio  for  the  tuned 
case.  In  fact,  the  selected  methods  of  destroying  the  disk’s  symmetry  actually  induced  mode  localization, 
increasing  the  value  of  the  stress  ratio.  However,  this  result  is  not  of  dire  consequence  to  future  design,  because  the 
perfectly  tuned  case  will  never  be  a  practicality. 

The  results  of  the  mistuned  cases  are  best  summarized  by  looking  at  each  case,  one  modal  group  at  a  time.  The 
first  modal  group  corresponds  to  the  first  beam  bending  shape  of  an  individual  blade.  In  the  cases  of  blade 
damage,  one  mode  of  this  group  showed  a  significant  increase  in  stress  ratio.  When  mass  was  added,  the  first 
mode of the group showed a significant increase in stress ratio. However, when mass was removed, the last mode
of  the  group  showed  a  significant  increase  in  stress  ratio.  This  mode  localization  is  clearly  visible  when  the  same 
mode  is  plotted  for  the  tuned  and  mistuned  cases,  using  the  same  deformation  scale  for  each  (Figures  10  and  11,  p. 
19).  None  of  the  proposed  changes  offered  a  significant  improvement  in  the  response  of  this  highly  localized 
mode.  The  techniques  employed  to  destroy  the  symmetry  of  the  disk  even  caused  one  or  more  other  modes  to  show 
an increase in stress localization. For the three random cases of mistuning, all eight modes of the group were
found to have high levels of stress localization. Destroying the symmetry of the disk causes sporadic improvement
of stress levels throughout this group. Most importantly, reducing the stiffness of the inner portion of the disk
results in consistent lowering of the stress level throughout the entire group (relative to the mistuned baseline
system). 

The second modal group corresponds to the first torsional mode of an individual blade. For all cases of
mistuning, the stress levels of this group were only slightly raised. Again the destruction of symmetry was shown
to have a negative effect on the system. Although some sporadic improvement was shown by changing both the
interior  and  exterior  stiffnesses  of  the  disk,  no  consistent  improvement  was  shown  in  this  modal  group  by 
implementing  any  of  the  proposed  design  changes. 

The  most  interesting  results  are  found  in  the  examination  of  the  third  modal  group,  corresponding  to  the  second 
beam  bending  mode  of  an  individual  blade.  In  the  cases  of  blade  damage,  there  was  only  a  single  highly  localized 
mode,  similar  to  the  results  obtained  for  the  first  modal  group.  The  only  proposed  design  change  that  did  not  offer 
significant improvement in the blade damage cases was the non-symmetric (mass added), axisymmetric disk
stiffness model. Reducing the interior disk stiffness seemed to trigger the most improvement. When this was
coupled with the destruction of symmetry, via the addition of mass, the maximum stress ratio dropped to as little as
one-fourth that of the mistuned baseline system. In the random mistuning cases all proposed design changes
offered some improvement, although the results of the mass added destruction of symmetry were rather
inconsistent. Again, the reduction of the interior disk stiffness seemed to be the most beneficial design
modification. Implementing both the reduction of the interior disk stiffness and mass added destruction of
symmetry yielded the best results. Here, the stress ratio dropped to as little as one-half that of the baseline system.

Conclusion 

Reducing  the  interior  disk  stiffness  relative  to  the  exterior  disk  stiffness  can  dramatically  improve  the 
performance  of  bladed  disk  systems  in  the  presence  of  mistuning.  Upon  examination  of  the  different  responses  in 
the  third  modal  group,  one  can  also  conclude  that  there  is  some  promise  in  the  prospect  of  destroying  the  symmetry 
of  the  disk.  Although  no  great  improvement  was  shown  in  the  responses  of  the  first  and  second  modal  groups,  the 
dramatic  improvement  shown  in  the  third  modal  group  is  enough  to  warrant  consideration  for  the  design  changes 
proposed. 

Only  one  pattern  of  mass  addition  was  used  in  this  study  to  represent  a  non-symmetric  disk.  The  addition  of 
this  mass,  in  itself,  was  shown  to  induce  significant  levels  of  mode  localization.  The  most  promising  results  were 
shown  when  destroying  the  disk’s  symmetry,  via  the  addition  of  mass,  was  coupled  with  a  reduced  interior  disk 
stiffness.  It  is  quite  possible  that  the  amount  of  mass  added  to  create  a  non-symmetric  disk  was  too  drastic.  The 
investigators speculate that the positive effects of reducing the interior disk stiffness were enough to overcome the
negative  effects  caused  by  destroying  the  symmetry.  Had  the  amount  of  mass  added  to  the  disk  been  less  severe, 
the  implementation  of  a  non-symmetric  disk,  in  itself,  may  have  proven  beneficial. 

Future  Study 

The  present  study  featured  seven  different  types  of  models.  Two  types,  disks  with  reduced  interior  stiffness  and 
mass  added  non-symmetric  disks,  yielded  promising  results  and  are  worthy  of  further  investigation.  Based  on  the 
conclusions  of  previous  authors,  it  is  the  authors’  belief  that  reducing  the  stiffness  of  the  outer  portion  of  the  disk  is 
also  worthy  of  further  investigation.  Future  work  should  include: 

1.  Investigation  of  different  levels  of  relative  stiffness  between  the  interior  and  exterior  portions  of  the  disk. 

2.  Investigation  of  different  patterns  and  mass  amounts  used  to  destroy  the  symmetry  of  the  disk. 

3.  Investigation  of  combinations  of  the  previous  two. 

4.  Formulation  of  new  quantitative  measures  to  summarize  investigative  results. 
Figure 10: Model 1 (baseline), eighth mode, tuned case.

Figure 11: Model 1 (baseline), eighth mode, 1% blade mass removed from blade 1.

Figure 12: Tuned Case: (a) First modal group, (b) Second modal group, (c) Third modal group. [Plots of Tuned Stress Ratio (Rt) v. Mode Number]

Figure 13: Mistuning Case 1 (1% Mass added to blade 1): (a) First modal group, (b) Second modal group, (c) Third modal group. [Plots of Mistuned Stress Ratio (Rm) v. Mode Number]

Figure 14: Mistuning Case 2 (1% Mass added to blade 2): (a) First modal group, (b) Second modal group, (c) Third modal group. [Plots of Mistuned Stress Ratio (Rm) v. Mode Number]

Figure 17: Mistuning Case 5 (1% Mass removed from blade 1): (a) First modal group, (b) Second modal group, (c) Third modal group. [Plots of Mistuned Stress Ratio (Rm) v. Mode Number]

Figure 19: Mistuning Case 7 (1% Mass removed from blade 3): (a) First modal group, (b) Second modal group, (c) Third modal group. [Plots of Mistuned Stress Ratio (Rm) v. Mode Number]

Mode   Tuned     Cases 1-4  Cases 5-8  Case 9    Case 10   Case 11
 #     Hz        Hz         Hz         Hz        Hz        Hz
 1     336.8     330.64     336.8      334.74    331.49    331.74
 2     336.8     336.8      336.8      336.56    332.55    333.6
 3     336.8     336.8      336.85     338.01    334.54    333.92
 4     336.9     336.85     336.9      338.45    334.96    334.36
 5     336.9     336.9      337        342.68    335.8     335.24
 6     337.07    337        337.07     342.83    336.59    339.18
 7     337.07    337.07     337.13     343.15    338.03    340.05
 8     337.15    337.13     343.59     343.49    342.34    342.94
 9     1411.4    1411.3     1411.4     1411.4    1411.3    1411.3
10     1411.4    1411.4     1411.4     1411.4    1411.3    1411.4
11     1411.4    1411.4     1411.4     1411.4    1411.4    1411.4
12     1411.5    1411.5     1411.5     1411.5    1411.5    1411.5
13     1411.5    1411.5     1411.5     1411.6    1411.5    1411.5
14     1411.8    1411.8     1411.8     1411.9    1411.8    1411.8
15     1411.8    1411.8     1411.9     1411.9    1411.8    1411.9
16     1412.0    1412       1412       1412      1412      1412
17     2066.1    2038.4     2066.5     2059.6    2040.9    2042.6
18     2066.7    2066.5     2066.8     2073.1    2051      2055.1
19     2066.7    2066.8     2069.5     2078.6    2057.8    2056.1
20     2072.8    2070.3     2072.8     2081      2061.3    2059
21     2072.8    2072.8     2076.9     2104.3    2070.6    2067.2
22     2081.4    2078.5     2081.6     2108.9    2072.5    2088.6
23     2081.4    2081.6     2083.7     2113.9    2082.8    2091.3
24     2084.6    2083.9     2114.1     2115.4    2106.7    2110.9

Table 4: Frequency table for Model 1 (symmetric mass distribution, axisymmetric stiffness).

Mode   Tuned     Cases 1-4  Cases 5-8  Case 9    Case 10   Case 11
 #     Hz        Hz         Hz         Hz        Hz        Hz
 1     336.39    330.39     336.39     334.47    331.22    331.47
 2     336.39    336.39     336.39     336.32    332.32    333.36
 3     336.39    336.39     336.51     337.74    334.26    333.67
 4     336.63    336.52     336.63     338.19    334.7     334.09
 5     336.63    336.63     336.82     342.42    335.58    335.02
 6     336.98    336.84     336.98     342.51    336.34    338.94
 7     336.98    336.98     337.07     342.97    337.8     339.79
 8     337.1     337.07     343.34     343.24    342.09    342.7
 9     1411.1    1411.1     1411.1     1411.1    1411.1    1411.1
10     1411.1    1411.1     1411.1     1411.2    1411.1    1411.1
11     1411.1    1411.1     1411.1     1411.2    1411.1    1411.1
12     1411.1    1411.1     1411.2     1411.2    1411.1    1411.2
13     1411.1    1411.2     1411.2     1411.3    1411.2    1411.2
14     1411.7    1411.6     1411.7     1411.7    1411.6    1411.6
15     1411.7    1411.7     1411.7     1411.7    1411.7    1411.7
16     1412      1411.9     1412       1412      1411.9    1412
17     2031.4    2012.7     2031.7     2030      2010.9    2012.4
18     2031.9    2031.7     2032.1     2047.2    2024.2    2025.9
19     2031.9    2032.1     2038.8     2050.2    2031.2    2030.8
20     2055.8    2049.2     2055.9     2061.1    2040.9    2044
21     2055.8    2055.9     2062.4     2081.2    2055.4    2058.1
22     2078.4    2072.3     2078.5     2097.2    2064.3    2070.8
23     2078.4    2078.5     2081.7     2105.9    2076.6    2081.6
24     2083.6    2082.4     2103.8     2110.5    2096.7    2102.5

Table 6: Frequency table for Model 3 (Symmetric mass distribution, reduced interior stiffness).


Mode   Tuned     Cases 1-4  Cases 5-8  Case 9    Case 10   Case 11
 #     Hz        Hz         Hz         Hz        Hz        Hz
 1     336.06    329.91     336.06     333.99    330.76    331.01
 2     336.06    336.06     336.06     335.8     331.81    332.85
 3     336.06    336.06     336.1      337.25    333.79    333.18
 4     336.14    336.11     336.14     337.69    334.21    333.61
 5     336.14    336.14     336.23     341.9     335.05    334.49
 6     336.3     336.24     336.3      342.06    335.83    338.42
 7     336.3     336.3      336.36     342.36    337.27    339.28
 8     336.38    336.36     342.81     342.71    341.56    342.16
 9     1409.5    1409.5     1409.5     1409.5    1409.4    1409.4
10     1409.5    1409.5     1409.5     1409.6    1409.5    1409.5
11     1409.5    1409.5     1409.6     1409.6    1409.5    1409.5
12     1409.8    1409.7     1409.8     1409.8    1409.7    1409.7
13     1409.8    1409.8     1409.8     1409.8    1409.8    1409.8
14     1410.1    1410.1     1410.1     1410.2    1410.1    1410.1
15     1410.1    1410.1     1410.2     1410.2    1410.1    1410.2
16     1410.3    1410.3     1410.3     1410.4    1410.3    1410.3
17     2058.2    2030.5     2058.4     2051.5    2033.1    2034.8
18     2058.8    2058.4     2058.8     2064.9    2043      2047
19     2058.8    2058.8     2061.3     2070.3    2049.8    2048.1
20     2064.3    2062       2064.3     2072.7    2053.3    2050.8
21     2064.3    2064.3     2068.6     2095.7    2062.3    2059.1
22     2073.2    2070.2     2073.2     2100.2    2064.4    2080.3
23     2073.2    2073.2     2075.7     2105.4    2074.5    2082.9
24     2076.7    2075.9     2105.5     2106.9    2098.2    2102.3

Table 7: Frequency table for Model 4 (Symmetric mass distribution, reduced exterior stiffness).
Abstract


A  system  for  the  molecular  beam  epitaxy  (MBE)  of  SiC  thin  films  was  installed  in  the 
Nanoelectronics  Laboratory  at  University  of  Cincinnati.  The  MBE  system  combines 
several  gas  and  solid  sources.  Preliminary  results  of  SiC  heteroepitaxial  growth  by  MBE 
were obtained. We have demonstrated the MBE growth of SiC on Si and semiconductor-on-insulator
(SOI) substrates. SiC growth was obtained by propane carbonization of the Si surface,
by the pyrolysis of silacyclobutane (SCB), or by the sequential use of
carbonization and pyrolysis. At a growth temperature of 800 °C, initial experiments
indicate a SiC growth rate of ~0.1 Å/s. The crystallinity of the film surface was investigated
using reflection high energy electron diffraction (RHEED). Films grown under certain
conditions produce RHEED patterns indicating crystalline cubic (3C) SiC, while under
other conditions a RHEED pattern is observed indicating a combination of crystalline and
poly-crystalline SiC. The thickness and composition of the SiC films were analyzed by
secondary ion mass spectrometry (SIMS). SIMS depth profiles indicate that no (or very
low) contamination from N, O, and B was found in the films.


1. Introduction


Silicon carbide (SiC) is a wide band gap semiconductor with high thermal stability
and conductivity, high breakdown voltage, etc. Among many SiC polytypes, 3C-SiC (or
β-SiC, the only cubic structure) is the most promising candidate for higher power, higher
temperature and higher frequency electronic devices because of its high electron mobility
(1000 cm²/V·s)[1] and high saturated drift velocity (above 10⁷ cm/s)[2]. Since bulk crystals of
3C-SiC are very expensive and their size is very small (~3 - 5 mm), heteroepitaxial growth
on Si substrate has become an alternative method for growing SiC. Due to the large lattice
constant mismatch (20%) and the thermal expansion coefficient difference (8%) between
3C-SiC and Si, considerable effort was devoted to improving the film quality, leading to a
two-step epitaxial growth. The first step is carbonization of the Si substrate to relieve the strain
between SiC and Si, but this SiC layer is usually too thin (a few hundred Å) to fabricate
devices. The second step is the essential growth of SiC on SiC by introducing both Si and C
precursors. Currently, CVD is widely employed to grow epitaxial 3C-SiC on Si. Since the
epitaxial growth temperature in CVD is generally higher than 1200 °C, which can cause
deterioration of the film quality and redistribution of the dopants, reduction of growth
temperatures must be realized in order to fabricate SiC devices. Molecular Beam Epitaxy
(MBE) is a promising method to reduce 3C-SiC growth temperatures.
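The lattice mismatch of about 20% quoted above can be reproduced with a short calculation. The lattice constants used below are commonly quoted room-temperature values and are an assumption of this sketch; they do not appear in the report itself.

```python
# Commonly quoted room-temperature lattice constants in angstroms
# (assumed for illustration; not taken from this report).
A_SI = 5.431      # silicon
A_3C_SIC = 4.3596 # cubic (3C) silicon carbide


def lattice_mismatch_percent(a_substrate, a_film):
    """Mismatch of the film lattice relative to the substrate lattice, in percent."""
    return 100.0 * (a_substrate - a_film) / a_substrate


mismatch = lattice_mismatch_percent(A_SI, A_3C_SIC)
print(f"3C-SiC on Si lattice mismatch: {mismatch:.1f}%")
```

Referenced to the Si substrate, the mismatch comes out just under 20%, consistent with the figure given in the text. Referencing the difference to the film lattice instead would give a larger percentage, so the choice of reference should be stated when comparing values.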

The  work  described  in  this  report  was  partially  sponsored  by  Research  and 
Development  Laboratories  as  a  Supplemental  Research  Extension  Program  from  AFOSR. 
The  following  sections  of  this  report  consist  of:  a  description  of  the  Riber  32  MBE  system 
recently  installed  at  the  University  of  Cincinnati;  a  brief  summary  of  previous  work  on 
growing  SiC  with  MBE;  preliminary  results  on  MBE  growth  of  3C-SiC  on  Si  at 
Cincinnati. 


2.  System  Description 

The MBE system consists of these primary components: a control computer (1),
control electronics and power supplies (2), a gas cabinet (3), a growth chamber (4), a load lock
chamber (5), and pumping systems. A schematic of the system is shown in Fig. 1.
Fig. 1. Schematic of University of Cincinnati MBE system from Riber.

2.1 Load Lock

The load lock (5) serves as a buffer chamber between atmosphere and the growth
chamber, which maintains UHV conditions. It contains a magnetically coupled sample
transfer rod and a four-sample storage stage with one outgassing station that can reach
temperatures up to 800°C. The stage can move in XYZ directions and rotate 360°. The
load lock is pumped by an ion pump (200 l/s) supplemented with a titanium sublimation
pump (1000 l/s). The base pressure of the chamber can reach 1x10 '*^ Torr after a 48 hour
bakeout at 200°C. One UHV port for venting the chamber with UHP nitrogen and for
rough pumping is located just below the ion pressure gauge. There is a gate valve to isolate
the load lock chamber from the ion pump during venting and roughing, and another gate
valve to isolate the load lock from the growth chamber. On one side of the load lock
chamber is a back door for loading and unloading samples. The load lock has additional
ports for future expansion with a surface analysis chamber. A photo of the UC MBE-32
load lock is shown in Fig. 2.

2.2  Rough  Pumping 

The rough pumping of the load lock and the growth chamber is done with a “rough
pump cart” which consists of two LN2-cooled sorption pumps in parallel with a small
mechanical pump (Fig. 3). The sorption pumps are first cooled to 77 K and the mechanical
pump is allowed to rough out the load lock to 150 Torr. Next, the two sorption pumps
are opened and closed in series until the pressure reaches 5x10'^ Torr and 1x10'^ Torr,
respectively. At that point, the chamber is isolated and the gate valve is opened to the main
pumping system.

Fig. 2. Photo of load lock with control electronics in background.

Fig. 3. Rough pump cart with mechanical and sorption pumps.

2.3  Growth  Chamber 

The growth chamber (4) (shown in Fig. 4) contains a sample manipulator, a cell
panel where the molecular sources are located, analysis tools (RHEED, RGA), LN2
cryoshrouds, and UHV pumping. The transfer rod places the sample directly onto an
xyz-controllable and rotatable manipulator (Fig. 5) that has heating capability to 1200°C via a
specially designed heater from Karl Eberl (Stuttgart, Germany). UHV pumping consists of
a CTI CT8 cryopump and a titanium sublimator. Pressures can go as low as 5x10 ” Torr
after a 48 hour bakeout and 1.2x10 " Torr with the cryopanels cooled. The cryopanels
consist of three LN2-filled shrouds for cooling the main chamber, the well surrounding the
pumping systems, and the cell panel to protect the sources from stray contaminants. Two ion
gauges monitor the pressure in the system, one close to the pumping well and the second at
the rear of the substrate holder to calibrate the fluxes from the cells incident on the sample.


Fig.  4.  Main  growth  chamber  (on  right)  with  all  attachments.  RGA  (upper  black  box)  and 
RHEED  gun  (just  below)  are  shown  in  foreground. 
Fig. 5. Riber MBE-32 manipulator.


The  cell  panel  (Fig.  6)  contains  both  Knudsen  effusion  cells  and  a  high  temperature  gas 
injector. In the near future, the cell panel will also be fitted with a plasma source
to enable the growth of III-nitride wide band-gap semiconductors. Also on the cell panel is
a  viewport  for  a  pyrometer  (to  be  retrofitted),  an  XTC  crystal  thickness  monitor,  and 
shutter  motors.  The  XTC  is  employed  to  measure  film  deposition  rate  from  an  electron 
beam  gun  used  to  melt  source  material.  Analysis  in  the  chamber  is  provided  by  an  Inficon 
residual  gas  analyzer  (RGA)  and  a  Staib  Instruments  35  kV  reflection  high  energy  electron 
diffraction  (RHEED)  gun.  The  RGA  is  used  to  determine  the  background  contaminants  in 
the  chamber  and  the  process  species  during  growth.  The  RHEED  gun  is  used  to  determine 
the  crystallinity  and  growth  rates  on  the  substrate.  This  will  be  discussed  in  a  later  section. 

2.4  Sources 

Fig. 6 shows the setup of the cell panel and location of the sources on the MBE
system. Ports 1 and 5-8 are fitted with Knudsen cells (shown schematically in Fig. 7) and are to be
filled with solid source ultra high purity (UHP) material for evaporation. An EPI high
temperature cell, capable of temperatures near 2000°C, resides in port 1 and is scheduled to
be filled with Er. Since port 1 is in a low angle position (5°), the crucible can only hold a
small charge in comparison to the bottom row (5-8) at 32°. Ports 5 and 8 contain Riber
single filament cells filled with Mg and Al, both dopants in the nitrides and SiC. Cells 6
and 7 are EPI Sumo dual filament cells designed to reduce cell “spitting” during operation
and increase uniformity. These will be filled with Ga and In for growth of nitrides and
other semiconductor heterostructures. A three-line gas injector and cracker is installed in
cell 4. The injector has 1200°C temperature capability, two Baratron flow controllers, and
one mass flow controller. Each line has a manifold to switch between 2 different gases:
line one has silacyclobutane (SCB, SiC3H8) and propane (C3H8), line two has dilution N2 or
ammonia (NH3), and line three has H2. The center port on the chamber is reserved for a
nitrogen plasma source for growing nitrides.

Fig. 6. Schematic of growth chamber cell panel.

2.5  Electronics 

A gas cabinet (3) to the left of the cell panel houses the cylinders for the three-line gas
injector and a mechanical pump that evacuates the gas lines and the cryopump exhaust. The
gas cabinet electronics, power supplies, pressure gauges, safety actuators, and
temperature controllers are all located in racks (2) next to the system. Two control computers
(1) are used to integrate the electronics: one PC operating in the NextStep Unix
environment, using a Riber program called Accessible to integrate all the growth
components of the system, and a second PC running under DOS/Windows to operate the RGA
and RHEED gun.


2.6  Analysis  Equipment 

In  situ  analysis  is  one  crucial  advantage  that  MBE  has  over  other  growth  processes. 
This  is  made  possible  by  the  UHV  conditions  inside  the  growth  chamber.  The 
environment  in  the  chamber  can  be  monitored  by  the  RGA  before,  during,  and  after  the 
run.  This  can  determine  the  quantities  of  reactive  species  in  the  chamber,  help  locate  the 
origin of grown-in impurities, and evaluate the condition of the pumping. We have
installed a Leybold Inficon H100M quadrupole with a Faraday cup for high pressure gas
analysis and an electron multiplier for low (<10 ® Torr) pressures. For surface-sensitive
analysis  of  the  substrate,  we  have  a  Staib  Instruments  35kV  electron  gun  for  obtaining 
RHEED  patterns.  This  can  quantify  the  substrate  condition,  from  the  effectiveness  of  the 
pre-cleaning  and  outgassing  to  the  atomic  growth  characteristics  and  subsequent  annealing 
effectiveness.  RHEED  can  also  be  used  to  determine  growth  rates  and  surface 
reconstructions. 


3.  Epitaxy  of  SiC  -  Literature  Survey 


3.1  Heteroepitaxy 

Carbonization  is  an  effective  process  to  relax  the  large  lattice  mismatch  between  Si 
and SiC. There are two different carbonization processes. In the first process, the Si
substrate is heated to high temperature (750-1100 °C) and then a hydrocarbon gas or
carbon (atoms or ions) is introduced to react with the Si surface. Several simple hydrocarbon
molecules, such as C2H2, C2H4, and C3H8, have been commonly used as the carbon source. The
gas  pressure  during  carbonization  varied  from  10'^  to  10'^  Torr.  Some  researchers  found 
the carbonized films were polycrystalline by this process [3-5], while others obtained films
of  good  quality  [6-9]. 
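
To give a sense of the "large lattice mismatch" that carbonization has to accommodate, the short sketch below computes the mismatch from commonly quoted room-temperature lattice constants. The two constants (about 5.431 Å for Si and about 4.360 Å for 3C-SiC) are not given in this report and are assumed here from standard references; the function name is illustrative only.

```python
# Commonly quoted room-temperature cubic lattice constants (assumed values,
# not taken from this report).
A_SI_ANGSTROM = 5.431      # silicon
A_3C_SIC_ANGSTROM = 4.360  # cubic (3C) silicon carbide


def lattice_mismatch(a_film: float, a_substrate: float) -> float:
    """Return the relative lattice mismatch (a_film - a_substrate) / a_substrate."""
    return (a_film - a_substrate) / a_substrate


mismatch = lattice_mismatch(A_3C_SIC_ANGSTROM, A_SI_ANGSTROM)
print(f"3C-SiC on Si mismatch: {mismatch * 100:.1f} %")
```

With these assumed constants the SiC lattice is roughly 20% smaller than that of Si, which is why a carbonized buffer layer is normally formed before further growth.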

In the second process, a carbon-containing gas is first introduced into the chamber
until the desired pressure is reached, and then the substrate temperature is raised to high
temperature (750-1100 °C) at a slow ramp rate (5-25 °C/min). There are several versions
of this process, with variations in the temperature ramp rate at different temperature stages
and the incorporation of an oxide removal step [3]. In general, this process is very effective
for obtaining single crystal 3C-SiC without double-positioning twin structures and pits
[3, 4, 5]. This process probably seals off the outward Si diffusion from the Si substrate,
which  is  believed  to  cause  surface  defects  [4,  5]. 

Essential growth is then used to grow additional SiC on the carbonized film using
both Si- and C-containing sources. The substrate temperature ranges from 750 to 1100°C, and
the  total  pressure  is  usually  10  ’  to  10'^  torr  during  growth.  The  effect  of  the  flux  ratio 
between  Si  and  C  species  on  film  quality  and  stoichiometric  ratio  has  been  extensively 
studied [5,10-13]. It was found that a Si-to-C flux ratio J_Si/J_C > 1 is a general rule for growing good crystalline
3C-SiC film with 1:1 stoichiometric ratio. Atomic layer epitaxy by MBE has also attracted a
lot of attention [14-17]. It was found that the surface superstructures during an alternating
supply of C source and Si source can be used to control film growth to atomic level
accuracy.

Lattice  matched  growth 

The lattice constant of (111) 3C-SiC nearly matches (< 0.1%) that of the c-plane of
6H-SiC. 6H-SiC is commercially available from several sources, with Cree Research
being the main supplier. 6H-SiC is also used as a substrate for growing good quality SiC
films. Some results show that 3C-SiC (111) grown epitaxially on 6H-SiC (0001) at
850-1000 °C has a double-positioning twin structure [18-19], while on a 6H-SiC (0114)
substrate, 3C-SiC(100) epilayers were grown without twin structures at temperatures as
low as 850°C [19].

Homoepitaxy of 6H-SiC was also investigated, and a growth process controlled to an
atomic level was obtained by monitoring surface superstructures during the supply of Si
and C atoms [20-22]. The grown film is predominantly 6H-SiC with a small amount of
3C-SiC mainly located at defect sites. These defects mostly extend from the substrate into the
film. 


4.  Preliminary  Results 


In  this  section  we  describe  the  first  SiC  MBE  experiments  performed  at  the 
University  of  Cincinnati  and  discuss  our  preliminary  results. 

4.1  Experimental  Conditions 

The epitaxial growth of 3C-SiC was carried out in a Riber GSMBE 32 system, which
was described in detail above. Briefly, the epitaxial growth
process starts with the growth chamber being evacuated by the Ti sublimation pump and
cryogenic  pump.  After  60  hours  baking  at  250  °C,  a  base  pressure  of  3x10  "  torr  can  be 
routinely obtained without filling the cryoshroud with liquid nitrogen. A quadrupole mass
analyzer and a 35 keV reflection high-energy electron diffraction (RHEED) system are
mounted on the growth chamber for gas analysis and surface crystallinity characterization,
respectively. A high temperature gas injector (1200 °C) with three gas lines is employed to
introduce propane and silacyclobutane (SiC3H8, SCB) into the growth chamber. The gas
flow is controlled by a PID controller. The substrate is introduced through the load lock
chamber, which is pumped by a Ti sublimation pump and an ion pump to < 5x10^-10 Torr.
The initial outgassing of the sample is also performed in this chamber. The sample is then
transported by a magnetically coupled transfer rod into the growth chamber and locked onto
a five-dimensional manipulator which contains a high temperature (1200 °C), high
uniformity oven. The heater temperature is measured by a W-Re thermocouple and
controlled by a PID controller.

Two  preliminary  growth  experiments  have  been  carried  out.  The  growth  conditions 
are  summarized  in  Table  1.  In  the  first  experiment  a  carbonized  SiC  SOI  (Si  On  Insulator) 
wafer  with  (111)  orientation  was  used  as  the  substrate  for  SiC  MBE  growth.  The 
carbonization was first performed by rapid thermal CVD (RTCVD). The carbonized SOI sample was
cleaned by dipping into 1% HF for 1 min and rinsing in DI water for 2 min. The sample
was then introduced into the load lock chamber, where it was outgassed at 300 °C for 12
hours. After the sample was transferred into the growth chamber it was heated to 1000 °C until
a sharp RHEED pattern with 6-fold rotation symmetry was observed, indicating a clean
β-SiC surface. Further growth on the carbonized film was done by SCB pyrolysis with a
sample temperature of 1000 °C.


                        Carbonization by C3H8                      SCB growth by MBE
#     Substrate         type   time    pressure         temp       time    pressure         temp
                               (min)   (torr)           (°C)       (min)   (torr)           (°C)
C21   Si(111) (SOI)     CVD    -       760              1235       80      -                1000
C1    Si(100) on-axis   MBE    120     1.7-4.8 x10^-?   776        -       -                -
CE1   Si(100) on-axis   MBE    20      1.3-7.1 x10^-?   800        9       1.8-3.7 x10^-?   800
CE2   Si(100) on-axis   MBE    120     1.4-5.1 x10^-?   800        95      1.8-3.5 x10^-?   800

Table 1. Experimental conditions for SiC MBE growth.


The second experiment utilized a Si(100) wafer as substrate. Both carbonization
and essential growth were carried out in the MBE system. Propane was used for
carbonization and SCB for essential growth on the carbonized film. The temperature
program is shown in Fig. 8.


Fig. 8. Temperature program of SiC growth process by MBE.


In this temperature program diagram, there are three parts: (1) the Si(100) substrate was
heated  to  400  °C  in  UHV  (10  ’°  -  lO'®  torr)  with  a  ramp  rate  of  20  °C/min;  (2)  the  supply  of 
the  propane  was  started  and  then  the  substrate  temperature  was  ramped  up  to  800  °C  at  a 
rate  of  20  °C/min;  (3)  the  essential  growth  of  3C-SiC  on  the  carbonized  layer  was  done  at 
800  °C  with  SCB  for  both  carbon  and  silicon  source. 
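
The two ramp stages of this temperature program can be sketched as simple linear ramps to estimate how long each takes. This is only an illustration: the starting temperature of 25 °C is an assumption (the report does not state it), and the `Ramp` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Ramp:
    """A linear temperature ramp between two setpoints."""
    start_c: float          # starting temperature, deg C
    end_c: float            # target temperature, deg C
    rate_c_per_min: float   # ramp rate, deg C / min

    def duration_min(self) -> float:
        """Time needed to go from start_c to end_c at the given rate."""
        return abs(self.end_c - self.start_c) / self.rate_c_per_min


ROOM_TEMP_C = 25.0  # assumed starting temperature; not stated in the report

program = [
    ("heat in UHV", Ramp(ROOM_TEMP_C, 400.0, 20.0)),
    ("ramp under propane (carbonization)", Ramp(400.0, 800.0, 20.0)),
]

for label, ramp in program:
    print(f"{label}: {ramp.duration_min():.2f} min")
```

Under these assumptions each ramp takes roughly 20 minutes, which is short compared with the carbonization and growth times listed in Table 1.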

The  crystallographic  features  of  the  growing  surface  were  monitored  by  the 
RHEED system. The RHEED images were captured by a CCD camera system, and a
software package (RHEED-VISION) was used to grab the image and monitor up to
four diffraction spot intensities during the growth. The thickness and composition of the
grown  layers  were  measured  by  secondary  ion  mass  spectrometry  (SIMS)  using  Cs 
bombardment  with  positive  ion  detection.  Elements  monitored  were  C,  Si,  O,  N  and  B. 
Relative  sensitivity  factors  derived  from  a  SiC  standard  were  used  to  convert  ion  counts  to 
concentrations.  The  thickness  of  SiC  layer  was  estimated  by  finding  the  position  in  the 
carbon  depth  profile  at  which  atomic  carbon  concentration  is  50%  of  its  maximum.  The 
SCB  growth  rate  was  calculated  by: 


$$\text{Growth Rate} = \frac{\text{Thickness(total)} - \text{Thickness(carbonized)}}{\text{Time(SCB)}} \tag{1}$$
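
The 50%-of-maximum thickness criterion described above can be sketched as follows. This is a minimal illustration, not the analysis code used for the measurements: the function name is hypothetical, linear interpolation between sample points is an assumption, and the depth profile is made-up data.

```python
def half_max_depth(depths, carbon):
    """Depth at which the carbon signal first falls to 50% of its maximum.

    depths: increasing depths (Angstroms); carbon: atomic carbon concentration
    at each depth. Linear interpolation is used between the bracketing points.
    """
    peak = max(carbon)
    half = 0.5 * peak
    start = carbon.index(peak)  # search only beyond the peak, towards the substrate
    for i in range(start, len(carbon) - 1):
        c0, c1 = carbon[i], carbon[i + 1]
        if c0 >= half > c1:
            frac = (c0 - half) / (c0 - c1)
            return depths[i] + frac * (depths[i + 1] - depths[i])
    raise ValueError("carbon signal never drops below 50% of its maximum")


# Hypothetical profile for illustration only (not measured data).
depths = [0, 20, 40, 60, 80, 100, 120, 140]
carbon = [48, 50, 50, 49, 45, 30, 10, 2]
print(half_max_depth(depths, carbon))
```

For this made-up profile the half-maximum crossing falls between the 100 and 120 Angstrom samples, so the estimated film thickness depends on how the profile is interpolated between measured points.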
Propane was supplied by Matheson Gas Products, Inc. with a purity of 99.97%.
SCB was provided by Dow Corning. SCB is a liquid at room temperature with a vapor
pressure of 400 Torr. Both propane and SCB vapor were used without further
purification. Before gas or vapor was introduced into the growth chamber, the cryoshroud
was filled with liquid nitrogen, and the Ti-sublimation pump and the RGA were turned off to
obtain a cleaner growth environment. All cells and the gas injector were set to 200 °C to avoid
their contamination by C3H8 or SCB. Both propane and SCB were introduced towards the
substrate  through  the  gas  injector.  The  flow  rate  of  propane  during  carbonization  was 
equivalent  to  its  partial  pressure  from  1.5  -  5.5  xl0‘®  torr,  the  flow  rate  of  SCB  during 
growth  corresponded  to  its  partial  pressure  from  6. 1  to  16  xlO  ''  torr.  The  sample  was  not 
rotated  during  the  growth  in  order  to  be  able  to  monitor  the  RHEED  pattern. 

4.2  Results  and  Discussion 

Fig.  9  shows  a  typical  RHEED  pattern  of  the  3C-SiC  sample  (C21)  obtained  by 
RTCVD  carbonization.  This  pattern  has  twin  diffraction  spots  which  may  be  the 
superposition  of  two  diffraction  patterns  corresponding  to  two  domains  which  are  180° 
rotated  from  each  other.  The  schematic  of  this  interpretation  is  shown  in  Fig.  10. 


Fig.  9.  Typical  RHEED  pattern  of  3C-SiC  film  by  carbonization  of  SOI  with  RTCVD. 


Therefore,  the  RTCVD  carbonized  film  appears  to  have  double-positioning  twin  structure. 
A possible reason for this twin structure is that the Si carbonization is driven by reaction
on  terraces,  with  different  terraces  causing  the  difference  in  stacking  order. 

After the carbonization by CVD as shown in Fig. 9, essential growth of SiC was
performed in the MBE growth chamber with SCB as both Si and C source. During the growth,
some fluctuation of intensity was observed. No regular intensity oscillation was observed,
probably indicating that the growth was not in a layer-by-layer mode or that the growth rate is
too  small.  However,  the  3C-SiC  RHEED  pattern  was  very  clear  during  the  entire  growth 
period.  Fig.  11  shows  a  RHEED  image  after  80  min  growth  at  a  SCB  partial  pressure  of 
3.0x10'^  Torr.  Compared  with  the  RHEED  pattern  of  Fig.  9,  the  twin  spots  in  Fig.  11  are 
much weaker, which probably indicates the double-positioning twin structure is much
reduced  after  essential  growth  by  MBE. 


[Schematic panels: RHEED from structure A; RHEED from structure B; RHEED from twin structure A+B.]

Fig. 10. Interpretation of the RHEED pattern in Fig. 9.

Fig. 11. RHEED pattern of SiC film grown by MBE with SCB on SOI sample carbonized
with propane by CVD (C21).

Carbonization and essential growth were also performed sequentially in the MBE
system. The substrate was a 2 inch on-axis Si(100) wafer. Fig. 12 shows the RHEED
patterns observed during SiC growth of sample CE2. Fig. 12 (a) is the RHEED pattern of
the Si(100) surface after annealing at 400 °C for half an hour. The clear streaky pattern
indicates a clean Si(100) surface. After introducing propane into the growth chamber, the
temperature of the substrate was ramped at a rate of 20 °C/min. During this temperature
ramping process, the RHEED pattern did not change until 760 °C. As the carbonization
proceeded, the RHEED pattern associated with 3C-SiC became stronger while that of
Si(100) became weaker and finally disappeared. Fig. 12 (b) is the 3C-SiC RHEED pattern
obtained after 2 hours of carbonization of Si(100). No twin spots were observed in the MBE
carbonized film, unlike in the film carbonized by RTCVD. This observation indicates
that the film carbonized by MBE has fewer double-boundary defects than that obtained by
CVD. Further growth of 3C-SiC was done by introducing SCB at a sample
temperature of 800 °C. Fig. 12 (c) shows the RHEED pattern after 5 min of growth by SCB.
The pattern looks more streaky than that of the carbonized film, indicating that the surface is
smoother. As growth proceeded, rings started to appear, as shown in Fig. 12 (d),
which indicates that a polycrystalline SiC film was being formed. A possible reason is that the
strain due to the large lattice mismatch between Si and SiC was not completely relieved by
carbonization.


Fig. 12. RHEED patterns observed during SiC growth (CE2): (a) clean Si(100) surface at
400 °C; (b) 3C-SiC grown by C3H8 carbonization for two hours at P(C3H8) ≈ 3x10^-? torr;
(c) 3C-SiC grown on the carbonized layer by SCB for 5 min at P(SCB) ≈ 2.9x10^-? torr;
(d) 3C-SiC grown on the carbonized layer by SCB for 95 min at P(SCB) = 1.8-3.5x10^-? torr.

The film composition and thickness of the MBE-grown SiC layers were measured
with secondary ion mass spectrometry (SIMS). Fig. 13 shows the depth profile of the
3C-SiC film (sample C1) grown by MBE carbonization of Si(100) at 776°C for two hours. It is
apparent that only a very thin Si layer (probably ~30-40 Å) has been converted at this
temperature. It is interesting to point out, however, that even this very thin carbonized
layer produced a very good RHEED pattern (similar to Fig. 12b) indicative of crystalline
SiC. Fig. 14 contains the depth profile of the 3C-SiC film (CE1) grown at 800°C using
carbonization for 20 min followed by SCB growth for 9 min. The thickness of the SiC
film is approximately 95-100 Å, judging by the depth at which the carbon concentration
reaches the 50% point. Fig. 15 shows the depth profile of the SiC film (CE2) also grown
at 800°C, but for significantly longer times: carbonization for 120 min and essential growth
by SCB for 100 min. In this sample we find a SiC film thickness of around 600 Å. Using
the film thickness obtained from the SIMS depth profiles and the SCB growth time, the
SiC growth rate by SCB at 800 °C is estimated to be around 0.1 Å/s. This low growth rate
could be due to a combination of effects: low growth temperature, low growth pressure,
and possibly fluctuation of the gas flow during gas introduction. In Figs. 14 and 15, a
long carbon tail into the Si substrate is observed. This could be caused by several effects:
(a) absence of an abrupt interface between SiC and the Si substrate, due to carbon atoms
diffusing into the Si during growth; (b) non-uniform film thickness, resulting in certain
locations where the SiC layer is removed sooner during SIMS profiling. The SIMS data
also show that the concentrations of the impurities N, O and B are less than 0.1% in the SiC film.
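
As a rough consistency check, Equation (1) can be applied to the thicknesses quoted above. The sketch below makes two assumptions that are not stated in the report: the carbonized-layer thickness is taken as 35 Å (the midpoint of the 30-40 Å estimated for sample C1, which was carbonized under slightly different conditions), and the SCB growth time for CE2 is taken as the 95 min listed in Table 1. The function name is illustrative.

```python
def scb_growth_rate(total_a, carbonized_a, scb_time_min):
    """Equation (1): SCB growth rate in Angstrom/s."""
    return (total_a - carbonized_a) / (scb_time_min * 60.0)


CARBONIZED_A = 35.0  # assumed: midpoint of the 30-40 A estimated for sample C1

# (sample, total thickness in A, SCB growth time in min) as quoted in the text
samples = [("CE1", 95.0, 9.0), ("CE2", 600.0, 95.0)]

rates = {}
for name, total_a, minutes in samples:
    rates[name] = scb_growth_rate(total_a, CARBONIZED_A, minutes)
    print(f"{name}: {rates[name]:.2f} A/s")
```

Under these assumptions both samples give a rate close to 0.1 Å/s, in line with the estimate quoted in the text.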

5.  Summary  and  Future  Work 

In this report we described the Riber MBE 32 system recently installed in the
Nanoelectronics Laboratory at the University of Cincinnati. Some preliminary results of
heteroepitaxy of SiC with this MBE system were presented. We have clearly demonstrated
that SiC growth with SCB can be accomplished by the MBE technique. At 800 °C, the SiC
growth rate obtained with SCB under the first set of conditions utilized is quite low (0.1
Å/s). We are continuing with experiments designed to determine the effect of growth
temperature, pressure and flow rate on the growth rate and film quality of
SiC. We also plan to investigate the use of other gas precursors for SiC growth and the
in-situ sequential growth of SiC and GaN. In addition to crystallinity and surface
morphology, we will also investigate the electrical and optical properties of the MBE-grown
SiC and GaN films.

[Axes: atomic concentration (%) versus depth (Angstroms).]

Fig. 13. SIMS depth profile of SiC film (sample C1) grown by two hours of carbonization
with C3H8. (a) linear scale; (b) semi-log scale.

Fig. 14. SIMS depth profile of SiC film (sample CE1) grown by 20 min carbonization and
9 min essential growth by SCB. The SiC film thickness is 95 Å. (a) linear scale; (b) semi-log scale.

Fig. 15. SIMS depth profile of SiC film (sample CE2) grown by 120 min carbonization and 100
min essential growth by SCB. The film thickness is 600 Å. (a) linear scale; (b) semi-log scale.

ABSTRACT

Restoration  and  superresolution  processing  of  images  collected  from  various  sensors  deployed  in 
multispectral  seeker  environments  often  become  necessary  to  enhance  image  resolution  in  order  to  facilitate  better 
false  target  rejection,  improved  automatic  target  recognition  and  aimpoint  selection.  Due  to  the  critical  importance 
of  this  technology  to  several  Air  Force  missions  and  its  relevance  to  diverse  on-going  projects  in  several  Air  Force 
laboratories,  we  are  conducting  detailed  studies  on  this  topic.  Two  major  outcomes  from  these  studies  have  been 
the  recognition  of  the  importance  of  optimally  tailored  restoration  and  superresolution  algorithms  for  the 
individual  sensor  type  and  operating  conditions,  and  the  feasibility  of  using  iterative  and  noniterative  processing 
techniques  in  an  intelligent  tailoring  of  these  algorithms.  Use  of  these  techniques  however  can  result  in  different 
performance  levels  and  also  can  bring  specific  advantages  and  disadvantages  to  the  overall  restoration  and 
superresolution  function  (and  hence  to  the  surveillance  and  smart  munition  guidance  objectives).  The  principal 
objective  of  the  research  reported  here  is  to  give  a  qualitative  comparison  of  the  performance  expected  from  an 
iterative  procedure  based  on  Bayesian  estimation  methods  with  that  resulting  from  a  class  of  noniterative 
restoration  methods.  More  quantitative  performance  evaluations  directed  to  an  explicit  demonstration  of  the 
spectrum  extrapolation  in  both  one-dimensional  and  two-dimensional  signals  from  an  iterative  implementation  of  a 
Maximum Likelihood (ML) algorithm are also presented. A modification of the algorithm to facilitate
simultaneous estimation of the point spread function of the sensor and resolution enhancement of the input image is outlined,
and an illustration of its performance in processing a set of passive millimeter-wave (MMW) images obtained from
a 95 GHz 1-foot diameter aperture radiometer is given.

PERFORMANCE OF ITERATIVE AND NONITERATIVE SCHEMES
FOR  IMAGE  RESTORATION  AND  SUPERRESOLUTION  PROCESSING 
IN  MULTISPECTRAL  SEEKER  ENVIRONMENTS 


Malur  K.  Sundareshan 

1.  INTRODUCTION 

A  significant  problem  that  affects  the  successful  realization  of  the  goals  of  many  tactical  missions  is  the 
poor resolution of images collected from the sensors used to assist surveillance and guidance operations. The
problem  is  particularly  prevalent  in  autonomous  missile  guidance  applications  where  diverse  mission  requirements, 
such  as  reliable  target  detection,  classification,  interleaved  acquisition,  track  and  engage  modes,  and  precision  kill, 
critically  depend  on  the  quality  of  data  collected  from  the  sensors  deployed  in  missile  seekers.  While  it  is  true  that 
deployment  within  a  common  aperture  package  of  complementary  sensors  operating  over  different  frequency 
ranges  would  enhance  the  detection,  classification  and  track  maintenance  performance  of  missile  seekers  (in 
addition  to  providing  increased  fault  tolerance  and  greater  immunity  to  countermeasures),  such  multispectral 
environments  also  accentuate  other  considerations,  such  as  sensor  fusion  requirements,  which  in  turn  demand  high 
resolution  sensor  data. 

The  problem  of  poor  resolution  in  imaging  sensors  stems  mainly  from  deployable  antenna  size  limitations 
(which  preclude  simply  increasing  the  physical  aperture  of  sensors  to  gain  high  image  resolution)  and  the 
consequent  diffraction  limits  on  the  achievable  resolution.  It  may  be  noted  that  the  wavelength  of  a  synthetic 
aperture  radar  (SAR)  operating  at  IGHz  is  about  1  inch  long  and  one  needs  an  antenna  as  big  as  40  ft  wide  in 
order  to  achieve  a  resolution  requirement  of  being  able  to  distinguish  points  in  a  scene  separated  by  about  1  meter 
at  a  distance  of  1  Km  [1].  Passive  millimeter-wave  (PMMW)  sensing  offers  superior  adverse  weather  capabilities 
(over  infra-red  (IR)  sensors,  for  instance)  due  to  easy  penetration  through  fog,  dust,  smoke,  etc.  However,  PMMW 
image  acquisition  sensors  suffer  from  poor  angular  resolution.  It  is  well  documented  that  the  angular  resolution 
achievable  by  a  94  GHz  system  with  a  1  ft  diameter  antenna  is  only  about  10  mrad,  which  translates  into  a  spatial 
resolution  of  about  10  meters  at  a  distance  of  1  Km.  Some  recent  studies  [2]  have  also  established  that  for  ensuring 
reasonably  adequate  angular  resolution  (typically  of  the  order  of  4  mrad),  a  94  GHz  PMMW  imaging  system  with  a 
sensor  depression  angle  of  60°  -  80°  needs  to  be  confined  to  very  low  operational  altitudes  (of  the  order  of  75-100 
meters)  which  puts  inordinate  demands  on  the  guidance  schemes  to  facilitate  such  requirements.  Similar  resolution 
limitations  and  the  consequent  requirements  on  operational  conditions  (some  of  which  may  be  clearly  impossible  to 
satisfy  for  tactical  missions  with  reliability  and  survivability  constraints)  exist  for  the  other  types  of  sensing 
modalities  as  well. 
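
The angular and spatial resolution figures quoted above for a 94 GHz system with a 1 ft aperture can be reproduced with the simple diffraction estimate of wavelength divided by aperture diameter. The sketch below is an illustration under that assumption; the function name and the optional `factor` argument (for example 1.22 for the Rayleigh criterion of a circular aperture) are not from the report.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0
FOOT_M = 0.3048


def angular_resolution_rad(freq_hz, aperture_m, factor=1.0):
    """Diffraction-limited angular resolution ~ factor * wavelength / aperture.

    factor=1.0 gives the simple lambda/D estimate; 1.22 would give the
    Rayleigh criterion for a circular aperture.
    """
    wavelength_m = SPEED_OF_LIGHT_M_S / freq_hz
    return factor * wavelength_m / aperture_m


theta = angular_resolution_rad(94e9, 1.0 * FOOT_M)
spot_m = theta * 1000.0  # small-angle cross-range resolution at 1 km
print(f"{theta * 1e3:.1f} mrad, {spot_m:.1f} m at 1 km")
```

The simple estimate gives a little over 10 mrad, or a little over 10 m at 1 km, consistent with the figures in the text; a stricter criterion such as Rayleigh's would give a somewhat coarser value.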

Typical seeker antenna patterns are of a “low-pass” filtering nature due to the finite size of the antenna or
lens that makes up the imaging system and the consequent imposition of the underlying diffraction limits. Hence
the image recorded at the output of the imaging system is a low-pass filtered version of the original scene. The
portions of the scene that are lost by the imaging system are the fine details (high frequency spectral components)
that accurately describe the objects in the scene, which also are critical for reliable detection and classification of
targets of interest in the scene. Hence some form of image processing to restore the details and improve the
resolution of the image will invariably be needed. Traditional image restoration procedures (based on
deconvolution and inverse filtering approaches) aim mainly at reconstruction of the passband and possibly
elimination of the effects of additive noise components. These hence have only limited resolution enhancement
capabilities. Greater resolution improvements can only be achieved through a class of more sophisticated
algorithms, called superresolution algorithms, which provide not only passband resolution but also some degree of
spectral extrapolation, thus making it possible to restore the high frequency spatial amplitude variations relating to the
spatial resolution of the sensor and lost through the filtering effects of the seeker antenna pattern. A judicious
utilization of the imaging instrument’s characteristics and any a priori knowledge of the features of the target,
together with an appropriately crafted nonlinear processing scheme, is what gives these algorithms the capability
to superresolve the input image by extrapolating beyond the passband range and thus extending the image
bandwidth beyond the diffraction limit of the imaging sensor.
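
The loss of fine detail through low-pass filtering can be illustrated with a small one-dimensional sketch. It assumes an idealized sharp cutoff in the spatial frequency domain, which is a simplification of a real antenna pattern; the scene, the cutoff index, and all function names are hypothetical. Two point sources a few samples apart are clearly separated in the scene but merge into a single lobe once the high frequencies are removed.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (O(N^2)); fine for a short demo."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def ideal_low_pass(x, cutoff):
    """Zero every spatial frequency whose index magnitude exceeds `cutoff`."""
    n = len(x)
    spectrum = dft(x)
    kept = [s if min(k, n - k) <= cutoff else 0.0 for k, s in enumerate(spectrum)]
    return [v.real for v in idft(kept)]


N = 64
scene = [0.0] * N
scene[30] = 1.0  # two point sources, four samples apart
scene[34] = 1.0

image = ideal_low_pass(scene, cutoff=4)

# In the scene the midpoint between the sources is empty; in the band-limited
# image the two sources blur into one lobe that is highest at the midpoint.
print(scene[32], round(image[32], 3), round(image[30], 3))
```

Recovering the two separate sources from such an image requires restoring spectral components beyond the cutoff, which is the spectral extrapolation that superresolution algorithms attempt.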

For application in missile seeker environments, it must be emphasized that superresolution is a
post-processing operation applied to the acquired imagery and consequently is much less expensive than
improving the imaging system for desired resolution. As an example, it may be noted that for visual imagery
acquired from space-borne platforms, some studies indicate that the cost of camera payload increases as the inverse
2.3 power of the resolution. Hence a possible two-fold improvement in resolution by superresolution processing in
this application roughly translates into a reduction in the cost of the sensor by a factor of about 5 (2^2.3 ≈ 4.9). Similar relations
also exist for sensors operating in the other spectral ranges (due to the relation between resolution and antenna
size), confirming the cost effectiveness of employing superresolution algorithms. The principal goal of
superresolution processing in multispectral seekers is hence to obtain an image of a target of interest (such as a
mobile missile-launcher or a tank) via post-processing that is equivalent to one acquired through a more expensive
larger aperture sensor.
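
The cost argument above is a one-line power-law calculation. The sketch below assumes, as the text states, that cost scales as the inverse 2.3 power of the resolution; the function name is illustrative.

```python
def cost_ratio(resolution_gain, exponent=2.3):
    """Relative sensor cost for a `resolution_gain`-fold finer resolution,
    assuming cost scales as (1 / resolution) ** exponent."""
    return resolution_gain ** exponent


saving = cost_ratio(2.0)
print(f"Two-fold resolution gain ~ {saving:.2f}x cost factor")
```

A two-fold resolution gain obtained in post-processing therefore corresponds to a sensor cost factor of about 4.9, that is, roughly a five-fold saving.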

Most  of  the  recent  analytical  work  in  the  development  of  image  restoration  and  superresolution  algorithms 
has  been  motivated  by  applications  in  Radioastronomy  and  Medical  Imaging.  While  this  work  has  given  rise  to 
some  mathematically  elegant  approaches  and  powerful  algorithms,  a  certain  degree  of  care  should  be  exercised  in 
adapting  these  approaches  and  algorithms  to  the  missile  seeker  environment.  This  is  due  to  the  convergence 
problems  often  encountered  by  iterative  schemes  and  the  specific  statistical  models  representing  the  scenarios 
facilitating  their  development.  For  example,  a  slowly  converging  algorithm  that  ultimately  guarantees  the  best 
resolution in the processed image may pose no implementational problems in Radioastronomy; however, it could
be entirely unrealistic for implementation in an autonomous unmanned tactical system that must operate fast
enough to track target motion. Hence a careful tailoring of the processing algorithm for each sensor supporting the
multispectral seeker is of critical importance in order to realize the possible performance benefits from
superresolution processing, which include better false target rejection, improved automatic target recognition and
aimpoint selection.

The  high  degree  of  importance  this  topic  has  to  present  and  future  Air  Force  missions  is  clearly  evident. 
Equally  evident  is  the  fact  that  research  on  this  topic  has  an  immediate  application  to  a  number  of  on-going 
programs  in  various  Air  Force  laboratories.  The  research  described  in  this  report  is  an  extension  of  the 
investigations that were conducted under a summer faculty visit to the Wright Laboratory Armament Directorate.
Two principal outcomes from these investigations [3] have been the recognition of the importance of optimally
tailored restoration and superresolution algorithms for the individual sensor type and operating conditions, and the
feasibility of using iterative and noniterative processing techniques in the tailoring of these algorithms. Use of
these techniques, however, can result in different performance levels and further entail specific advantages and
disadvantages. In this report we shall give a qualitative comparison of the performance expected from an iterative
procedure based on Bayesian estimation methods with that resulting from a class of noniterative restoration
methods. Quantitative performance evaluations directed to demonstrating the spectrum extrapolation in both
one-dimensional and two-dimensional signals resulting from an iterative implementation of a Maximum Likelihood
(ML)  algorithm  will  also  be  presented.  To  counter  the  inaccuracies  in  the  modeling  of  the  Point  Spread  Function 
(PSF)  of  the  sensor,  a  modification  of  this  algorithm  to  facilitate  simultaneous  estimation  of  PSF  parameters  and 
resolution  enhancement  of  input  image  is  outlined  and  an  illustration  of  its  performance  in  processing  a  set  of 
passive  millimeter-wave  (MMW)  images  obtained  from  a  95  GHz  1-foot  diameter  aperture  radiometer  is  given. 

2.  MATHEMATICAL  REPRESENTATION  OF  IMAGE  RESTORATION  AND 
SUPERRESOLUTION  PROBLEMS 

In  this  section  we  shall  briefly  describe  the  technical  problems  underlying  the  restoration  and 
superresolution  processing  of  sensor  data  in  terms  of  reconstructing  the  spectral  components  of  the  signals  being 
processed.  A  brief  outline  of  the  information  available  for  developing  specific  algorithms  in  order  to  solve  these 
problems  will  also  be  given. 

2.1  Image  Formation  Process  (Observation  Model) 

Every systematic image processing study (including image restoration and superresolution processing)
starts with an appropriate mathematical model characterizing the process of image formation by the sensor
employed, which is termed an “observation model”. Irrespective of the type of sensor actually used, a commonly
used observation model takes the form

$$ g = s(Hf) + n \tag{1} $$

where $f$ denotes the object being sensed, $g$ its image, and $H$ the operator that models the filtering
process including any associated degradations (such as those due to the small aperture size of the sensor) and blur
phenomena (caused by atmospheric effects, motion of the object or the sensor, out-of-focus operation, etc.); $n$
denotes the additive random noise in the sensing process, which includes both the receiver noise and any
quantization noise. The response of the image recording sensor to the intensity of the input signal (light, radar, etc.) is
represented by the memoryless mapping $s(\cdot)$, which is in general nonlinear.

For the sake of precision, let us consider the image to be obtained from an incoherent sensor. We will also
assume that the image to be processed consists of $M \times M$ equally spaced grey level pixels, obtained through a
sampling of the image field at a rate that satisfies the Nyquist criterion. Furthermore, for mathematical tractability
we make the commonly used assumptions that (i) the imaging process is space-invariant, (ii) the
nonlinear effects of the sensor can be ignored, and (iii) the noise process can be approximated by a zero-mean white
Gaussian random field which is independent of the object. With these assumptions, Equation (1) can be rewritten to
relate the image intensity value at pixel $(i,j)$ to the object pixel values as

$$ g(i,j) = h(i,j) \otimes f(i,j) + n(i,j) = \sum_{k=1}^{M} \sum_{l=1}^{M} h(i-k,\, j-l)\, f(k,l) + n(i,j) \tag{2} $$

where $h(i,j)$ denotes the point spread function (PSF) of the sensor.

For an image of size $M \times M$, Equation (2) corresponds to a set of scalar equations specifying the
formation of each image pixel. For a further simplified representation [4,5], by a lexicographical ordering of the
signals $g$, $f$ and $n$, one can rewrite Equation (2) as resulting from a convolution of two one-dimensional vectors
$h = [h(1), h(2), \ldots, h(N)]^T$ and $f = [f(1), f(2), \ldots, f(N)]^T$ as

$$ g(i) = h(i) \otimes f(i) + n(i) = \sum_{j=1}^{N} h(i-j)\, f(j) + n(i), \qquad i = 1, 2, \ldots, N, \tag{3} $$

where $N = M^2$. More compactly, Equation (3) can be rewritten as the vector equation

$$ g = Hf + n \tag{4} $$

where $g$, $f$, $n$ are vectors of dimension $N$, and $H$ denotes the PSF block matrix whose elements can be
constructed [4,5] from the PSF samples $\{h(1), h(2), \ldots, h(N)\}$. It should be noted that Equations (3) and (4)
represent space-domain models and are equivalent to the frequency-domain model

$$ G(\omega) = H(\omega) F(\omega) + N(\omega) \tag{5} $$

where $\omega$ is the discrete frequency variable and $G(\omega)$, $F(\omega)$, $H(\omega)$, and $N(\omega)$ are the DFTs of the $N$-point
sequences $g(i)$, $f(i)$, $h(i)$, and $n(i)$, respectively.
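As an illustrative sketch (not part of the original study), the equivalence of the space-domain model (4) and the frequency-domain model (5) can be checked numerically in one dimension. The sketch assumes circular (periodic) convolution, so that $H$ is a circulant matrix and the DFT relation holds exactly. The Gaussian PSF, the three-impulse object, the noise level and the helper name `circulant_psf_matrix` are hypothetical choices made only for illustration.

```python
import numpy as np

def circulant_psf_matrix(h):
    """Build the N x N circulant matrix H with H[i, j] = h[(i - j) mod N]."""
    n = len(h)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return h[idx]

rng = np.random.default_rng(0)
N = 64

# Hypothetical object: three point sources on a dark background.
f = np.zeros(N)
f[[20, 24, 40]] = [1.0, 0.8, 0.5]

# Hypothetical low-pass PSF: a Gaussian wrapped around index 0, unit sum.
i = np.arange(N)
d = np.minimum(i, N - i)              # circular distance from index 0
h = np.exp(-0.5 * (d / 2.0) ** 2)
h /= h.sum()

n = 0.01 * rng.standard_normal(N)     # zero-mean white Gaussian noise

# Space domain, Equation (4): g = H f + n
H = circulant_psf_matrix(h)
g_space = H @ f + n

# Frequency domain, Equation (5): G = H F + N, then back to the space domain
G = np.fft.fft(h) * np.fft.fft(f) + np.fft.fft(n)
g_freq = np.fft.ifft(G).real

print("models agree:", np.allclose(g_space, g_freq))
```

For a non-periodic (linear) convolution the same check would require zero-padding the sequences before taking the DFTs.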


2.2  Image  Restoration  and  Superresolution  Problems 

For application in missile seeker environments, Equation (3) describes the process of image formation
when an unknown object with radiance distribution $\{f(i)\}$ is imaged through a sensor with a shift-invariant PSF
$\{h(i)\}$. As noted earlier, practical seeker antenna patterns have a low-pass spectral characteristic and
consequently the image obtained is a low-pass filtered version of the object (or scene) being imaged. The problem
of interest is then to recover the object, i.e. $\{f(i)\}$, by solving Equation (3) (or Equation (5)). However, since the
noise sequence $\{n(i)\}$ will not be known exactly, one will not be able to solve Equation (3) for $\{f(i)\}$ exactly
even when $\{h(i)\}$, the PSF of the seeker antenna, is exactly known. One can only hope to obtain an estimate
$\{\hat{f}(i)\}$ which is in some sense close to the original $\{f(i)\}$, based on some reasonable assumptions on the noise
process $\{n(i)\}$. If a distance measure $J(g,f)$ between $g$ and $f$ is used as a norm to measure the closeness of
the estimate, the problem of interest can be specified concisely as obtaining the estimate
$\hat{f} = [\hat{f}(1), \hat{f}(2), \ldots, \hat{f}(N)]^T$ such that

$$ \hat{f} = \arg\min_{f} J(g,f) = \arg\min_{f} J\left[ g(i) - h(i) \otimes f(i) \right] \tag{6} $$

An examination of the frequency spectra of the object and the image is useful to see clearly the effect of the seeker
antenna. Let us assume that the object is space-limited with spatial extent $s$ and, without any loss of generality,
assume that $f(i)$ is nonzero only on a single interval of length $s$. This implies that the spectrum $F(\omega)$ has infinite
extent, i.e. the object has infinite bandwidth, and in the discrete frequency domain, the spectral components in
$F(\omega)$ extend all the way to $\omega_s/2$, the folding frequency, as shown in Fig. 1a.

The image spectrum $G(\omega)$ is a low-pass filtered version of $F(\omega)$ with the cut-off frequency $\omega_c$
determined by the diffraction limit of the sensor. Assuming an ideal low-pass filter characteristic, the shape of
$G(\omega)$ will be as shown in Fig. 1b with the spectral components removed in the interval $\omega_c < \omega < \omega_s/2$. The
degradations in the image are hence caused by three factors: (1) spectral mixing within the passband $0 < \omega < \omega_c$
due to the convolution with the PSF of the seeker antenna; (2) spectral attenuation caused by removal of spectral
components outside the passband; and (3) corruption of the passband due to the additive noise process $\{n(i)\}$.
Perfect image restoration requires compensation for all three factors cited above.

Fig. 1a. Low pass filtering by seeker antenna

Fig. 1b. Image spectrum resulting from low pass filtering of object spectrum

Traditional image restoration methods aim mainly at passband reconstruction, i.e. at eliminating the
degradations caused by the first and the third factors above. This is achieved by various deconvolution and noise
filtering approaches [5,6]. The goal of superresolution is to correct for all three of the above factors, and hence in
addition to restoration of spectral components in the passband, extrapolation of the spectrum beyond $\omega_c$ is to be
achieved. Evidently, the ideal of restoring all lost spectral components may be too ambitious and hence realistically
one may have to be content with some spectral extrapolation which facilitates recovering the spectrum in the
interval $\omega_c < \omega < \omega_e$, where $\omega_e < \omega_s/2$ is an extended frequency limit. It is of interest to note that even if only this
limited goal is attained, the effective cut-off frequency is moved from $\omega_c$ to $\omega_e$ and hence the processed
image appears as the image acquired from a higher resolution (more expensive, larger aperture) sensor with this
larger cut-off frequency. It should also be noted that since generation of new frequency components not present in
the original image is attempted, some form of nonlinear processing becomes essential, since linear signal
processing methods cannot produce frequencies not present in the input signal.

To illustrate the complexity in solving problems of this type, consider the simplest case when no spectral
extrapolation is needed, the PSF $\{h(i)\}$ of the sensor is assumed to be known and the noise $n$ is ignored. This is
the classical deconvolution problem [5,6] of solving the vector equation

$$ g = Hf \tag{7} $$

for $f$ given $g$ and $H$, and a solution can be attempted in the form of an “inverse filter” given by

$$ \hat{f} = H^{-1} g \tag{8} $$

Unfortunately, there are several problems with this approach. The system of equations given by Equation (7) is
often underdetermined, which results in $H^{-1}$ not being defined. Even if $H^{-1}$ (or a generalized inverse of $H$) can
be computed, the estimate $\hat{f}$ obtained may be worthless due to the presence of the noise that was ignored. Observe
from the image formation model given by Equation (4) that, when the presence of noise is accounted for,

$$ \hat{f} = H^{-1} g = f + H^{-1} n $$

It is now clear that, $H$ being a low-pass filter, $H^{-1}$ corresponds to a high-pass filter and hence the noise is greatly
amplified in the solution estimate. A difficulty of a related nature, which also can make the solution given by
Equation (8) of limited value, is that an exact knowledge of $H$ is needed for computing the solution and even a
small uncertainty in the parameters describing the sensor PSF can result in a very large discrepancy in the solution.
In other words, the solution given by Equation (8) is not “robust” enough to tolerate the nonideal conditions that
may exist in practice, making the estimate obtained useless. Finally, the inverse filter solution is a linear operation
and provides no extrapolation of the spectrum, thus lacking any capability for superresolving. It will be seen later that
the drawbacks of this solution procedure stem from the fact that no use is made of any a priori knowledge about the
object being restored.
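The noise amplification of the inverse filter can be made concrete with a small numerical sketch. It is an illustration under stated assumptions, not a result from this study: the convolution is taken to be circular so that Equation (8) reduces to a division by $H(\omega)$ in the DFT domain, and the Gaussian PSF, the three-impulse object and the noise standard deviation of $10^{-3}$ are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 64
i = np.arange(N)
d = np.minimum(i, N - i)              # circular distance from index 0

# Hypothetical object and symmetric Gaussian PSF (both are assumptions).
f = np.zeros(N)
f[[20, 24, 40]] = [1.0, 0.8, 0.5]
h = np.exp(-0.5 * (d / 1.5) ** 2)
h /= h.sum()

H_w = np.fft.fft(h)                   # low-pass transfer function H(w)
g_clean = np.fft.ifft(H_w * np.fft.fft(f)).real
g_noisy = g_clean + 1e-3 * rng.standard_normal(N)

def inverse_filter(g):
    """Equation (8) in the frequency domain: F_hat(w) = G(w) / H(w)."""
    return np.fft.ifft(np.fft.fft(g) / H_w).real

err_clean = np.max(np.abs(inverse_filter(g_clean) - f))
err_noisy = np.max(np.abs(inverse_filter(g_noisy) - f))

print(f"smallest |H(w)|           : {np.min(np.abs(H_w)):.2e}")
print(f"max error, noise-free     : {err_clean:.2e}")
print(f"max error, noise std 1e-3 : {err_noisy:.2e}")
```

In this sketch the noise-free inversion is essentially exact, while a noise level three orders of magnitude below the object amplitude yields an estimate whose error exceeds the object amplitude itself, because the noise spectrum is divided by the very small high-frequency values of $H(\omega)$.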

The  idea  of  recreating  the  spectral  components  that  are  removed  by  the  imaging  process  and  hence  are  not 
present  in  the  image  available  for  processing  may  pose  some  conceptual  difficulties,  which  may  lead  one  to  suspect 
whether  superresolution  is  indeed  possible.  Fortunately  there  exist  sound  mathematical  arguments  confirming  the 
possibility  of  spectral  extrapolation.  The  primary  justification  comes  from  the  Analytic  Continuation  Theorem  and 
the property that when an object has finite spatial extent its frequency spectrum is analytic [7]. Due to the property
that a finite segment of any analytic function in principle determines the whole function uniquely, it can be readily
proved that knowledge of the passband spectrum of the object allows a unique continuation of the spectrum beyond
the diffraction limit imposed by the imaging system. It must be emphasized that the limited spatial extent of the
object is critical in providing this capability for extrapolation in the frequency domain.

2.3  Use  of  a  priori  Knowledge  in  Solution  Process 

As noted in the last section, due to the ill-posed nature of the inverse filtering problem underlying image
restoration and superresolution objectives, it is necessary to have some a priori information about the ideal
solution, i.e. the object $f$ being restored from its image $g$. In algorithm development, this information is used in
defining appropriate constraints on the solution and/or in defining a criterion for the “goodness” of the solution.
How to utilize this information is at the heart of a well-tailored superresolution algorithm.

The specific a priori knowledge that can be used evidently depends on the specific application. For
applications in astronomy, it could come in the form of some known facts about the spectral differences of the
objects one is looking for (for instance, a double star as opposed to a star cluster). In medical imaging and in
military applications, it could come from the geometrical features of the object (target shape, for instance). For
radar and MMW imagery, one could use the fundamental knowledge that the reflectivity of any point on the
ground cannot be negative. In addition to the nonnegativity constraint, a space constraint resulting from the
known space-domain limits on the object of interest could be used. Other typically available constraints include
level constraints (which impose upper and lower bounds on the intensity estimates), smoothness constraints
(which force neighboring pixels in the restored image to have similar intensity values) and edge-preserving
constraints. More complicated constraints are possible, but in general they result in tuning the algorithms to
specific classes of targets.
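The simplest of the constraints listed above (nonnegativity, level bounds and a space constraint) can each be imposed on a candidate estimate by a pointwise operation. The following is a minimal sketch; the function name `apply_constraints`, its arguments and the sample values are hypothetical and are not taken from any of the algorithms cited in this report.

```python
import numpy as np

def apply_constraints(f_est, support=None, lower=0.0, upper=None):
    """Impose simple a priori constraints on an object estimate.

    lower / upper : level constraints (lower=0.0 gives nonnegativity).
    support       : boolean mask, True where the object may be nonzero
                    (space constraint); pixels outside it are set to zero.
    """
    out = np.asarray(f_est, dtype=float).copy()
    if lower is not None or upper is not None:
        out = np.clip(out, lower, upper)          # level constraints
    if support is not None:
        out[~np.asarray(support, dtype=bool)] = 0.0   # space constraint
    return out

# Hypothetical estimate with a negative value, an overshoot and
# energy outside the known object support.
estimate = np.array([-0.2, 0.5, 1.7, 0.3])
support = np.array([True, True, True, False])
constrained = apply_constraints(estimate, support=support, lower=0.0, upper=1.0)
print(constrained)
```

In an iterative scheme an operation of this kind would typically be applied once per iteration, whereas smoothness and edge-preserving constraints couple neighboring pixels and need more than a pointwise operation.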

Depending on the extent to which a priori knowledge can be incorporated in algorithm development, a large
number of image restoration approaches and algorithms have been introduced into the literature, too many to
describe or reference here. One may refer to some recent survey papers [8,9] for a review of the extensive activity
on this topic. In this section, we shall only briefly cite a few of the approaches that have received some interest in
the context of superresolution capabilities, i.e. those that provide possible spectral extrapolation. It should be noted
clearly that not all image restoration methods provide the capability for superresolving. In fact, a majority of
existing schemes may perform decent passband restoration, but provide no bandwidth extension at all.

The various approaches in general attempt to code the a priori knowledge to be used by specifying an
object model or a set of constraint functions, and further employ an appropriate optimization criterion to guide
the search for the best estimate of the object. A convenient way of classifying the resulting algorithms is into
iterative and noniterative (or direct) schemes. Noniterative approaches generally attempt to implement an inverse
filtering operation (without actually computing the inverse of the PSF matrix $H$, however)
and have poor noise characteristics. All required computations and any possible use of constraint functions are
applied in one step. In contrast, iterative methods apply the constraints in a distributed fashion as the solution
progresses (as shown in Fig. 2) and hence the computations at each iteration will generally be less intensive than
the single-step computation of noniterative approaches. Some additional advantages of iterative techniques are
that (1) they are more robust to errors in the modeling of the image formation process (uncertainties in the
elements of the PSF matrix $H$, for instance), (2) the solution process can be better monitored as it progresses, (3)
constraints can be utilized to better control the effects of noise (and possibly clutter), and (4) they can be tailored to
offset sensor nonlinearities. The disadvantages of these methods generally are (1) increased computation time, and
(2) the need for proving convergence of the iterative scheme (in fact, for some algorithms this could be impossible). In
the development of an efficient processing algorithm for a specific application one needs to evaluate these
tradeoffs and tailor the steps of the algorithm to exploit the inherent features available in that application.

Fig. 2a. Schematic of Noniterative (Direct) Superresolution

Fig. 2b. Schematic of Iterative Superresolution

3.  SPECIFIC  ALGORITHMS  FOR  IMAGE  RESTORATION  AND  SUPERRESOLUTION 

Among the several approaches that utilize iterative or noniterative processing techniques, a few specific
algorithms have been receiving a greater share of attention in regard to their claims for restoration and
superresolution performance when applied to multispectral seeker data. During this project, we focused our
attention on two candidate algorithms, one representative of the noniterative class of algorithms and
the other representative of the iterative class. This section gives a brief outline of these algorithms, followed by
qualitative and quantitative performance evaluations that highlight the strengths and weaknesses of the two
processing approaches.

3.1  A  Noniterative  Algorithm  Using  Matrix  Computations 

The development of noniterative algorithms that employ simple matrix operations has been a popular line
of investigation in recent times. One of the better known algorithms of this type has been given by Gleed and
Lettington [10] using a regularized pseudo-inverse computation approach. Starting with the space-domain image
formation model given by Equation (4), Gleed and Lettington note that evaluating the solution as

$$ f = H^{-1} g - H^{-1} n \tag{9} $$

provides a poor quality estimate due to the noise amplification caused by $H^{-1}$ (in turn due to some eigenvalues of
$H$ becoming too small). To overcome this difficulty, they propose to modify the estimate by first diagonalizing the
$H$ matrix through the transformation

$$ \Lambda = M^{-1} H M $$

where $M$ is the modal matrix of $H$ and $\Lambda$ is a diagonal matrix with the eigenvalues of $H$ along the diagonal
[11]. The object estimate $\hat{f}$ is then obtained as

$$ \hat{f} = H_{mod}^{-1}\, g - H_{mod}^{-1}\, n \tag{10} $$

where $H_{mod}^{-1}$ is computed as


^mld  =  A-'  ]M^  ,  (1 1) 

where $\mu_1 > 0$ is a scalar parameter to be selected appropriately based on the noise $n$ present.

The solution given by (11) changes the PSF of the imaging system from $H$ to $H_{mod}$, however, and it is
necessary to account for this change. Gleed and Lettington [10] propose a “regularization” operation by
constructing the matrix $R$ as


and  obtaining  the  final  estimate  as 


(12) 


/  ~  A  .  (13) 

In Equation (12), $\mu_2$ is another user-selected parameter satisfying the condition $\mu_2 < \mu_1$. Gleed and Lettington
[10] report getting satisfactory resolution improvements in processing various images including PMMW imagery.
The exact extent of spectral extrapolation obtained by this method is, however, not clear. Furthermore, the selection
of the scalars $\mu_1$ and $\mu_2$ is rather ad hoc.
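To illustrate the general idea of stabilizing an inverse through the eigenvalues of $H$, the sketch below applies a generic Tikhonov-style filter to the eigenvalues of a symmetric PSF matrix. It is emphatically not the Gleed and Lettington construction of Equations (11) to (13); the filter $\lambda/(\lambda^2+\mu)$, the function name `regularized_inverse`, the value of `mu`, the circulant Gaussian PSF and the test object are all assumptions chosen for illustration.

```python
import numpy as np

def regularized_inverse(H, mu):
    """Tikhonov-style inverse of a symmetric matrix H.

    H = M diag(lam) M^T is inverted as M diag(lam / (lam**2 + mu)) M^T,
    so eigenvalues much smaller than sqrt(mu) are damped, not amplified.
    With mu = 0 this is the ordinary inverse.
    """
    lam, M = np.linalg.eigh(H)            # modal matrix M, eigenvalues lam
    return (M * (lam / (lam ** 2 + mu))) @ M.T

rng = np.random.default_rng(2)
N = 64
idx = np.arange(N)
d = np.minimum(idx, N - idx)
h = np.exp(-0.5 * (d / 1.5) ** 2)         # hypothetical symmetric PSF
h /= h.sum()
H = h[(idx[:, None] - idx[None, :]) % N]  # symmetric circulant PSF matrix

f = np.zeros(N)
f[[20, 24, 40]] = [1.0, 0.8, 0.5]         # hypothetical object
g = H @ f + 1e-3 * rng.standard_normal(N) # Equation (4)

f_naive = np.linalg.solve(H, g)           # plain inverse filter
f_reg = regularized_inverse(H, mu=1e-3) @ g

err_naive = np.max(np.abs(f_naive - f))
err_reg = np.max(np.abs(f_reg - f))
print(f"max error, plain inverse       : {err_naive:.2e}")
print(f"max error, regularized inverse : {err_reg:.2e}")
```

As with the scalars in the method described above, the value of `mu` here is picked by hand: larger values suppress more noise but discard more of the high-frequency content of the object.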


3.2  A  Maximum  Likelihood  (ML)  Algorithm  for  Iterative  Superresolution 

An  important  class  of  algorithms  that  are  receiving  particular  attention  in  recent  times  for  their 
superresolution  capabilities  are  those  that  can  be  developed  starting  with  a  statistical  modeling  of  the  imaging 
process.  The  basic  idea  underlying  these  methods  is  to  account  for  the  statistical  behavior  of  emitted  radiation  at 
the  level  of  individual  photon  events  by  constructing  appropriate  object  radiance  distribution  models  (using 
knowledge  of  fluctuation  statistics). 

For a brief description of these methods, let $f(x)$ denote the object’s intensity function, $x \in X$, where
$X$ defines the region over which intensity is defined, and let $g(y)$ denote the intensity detected in the image,
$y \in Y$, where $Y$ defines the region over which intensity is detected. If $\{h(y,x),\ y \in Y \text{ and } x \in X\}$ denotes
the point spread function (PSF) of the imaging sensor, then accounting for the presence of noise in the imaging
process, one can model the imaging process by

$$ g(y) = \sum_{x \in X} h(y,x)\, f(x) + \text{noise} \tag{14} $$

(where an additive noise is assumed as before for the sake of simplicity). The classical restoration problem is to
find the object intensity estimate $\{\hat{f}(x)\}$ given the data $\{g(y)\}$.

There exists considerable literature on developing explicit algorithms, mainly of an iterative nature, for
handling the image restoration problem within a statistical framework afforded by such a formulation. A
particularly attractive approach is to obtain a maximum likelihood (ML) estimate $\{\hat{f}(x)\}$, i.e. the object intensity
estimate that is most likely to have created the measured data $\{g(y)\}$ with the PSF process $\{h(y,x)\}$, which in turn
is developed by maximizing an appropriately modeled likelihood function (or the logarithm of this function, for
simplicity). Modeling the likelihood function basically amounts to obtaining a goodness-of-fit (GOF) quantity for the
measured data, since the likelihood function is a statistical distribution function $p(g|f)$ obtained as a fit to the
relation between the data $\{g(y)\}$ and the object $\{f(x)\}$. The success of image restoration in a given application
depends on how well the assumed conditional probability function fits the input/output characteristics of the
imaging system. While a commonly used model is a Chi-squared function, very active and intense research
continues to this day on the development of improved statistics that lead to better ML estimates.

In the formulation of iterative algorithms that afford simple implementation, an important contribution
has been the work of Shepp and Vardi [12] who used the Expectation Maximization (EM) algorithm (originally
suggested by Dempster et al. [13]) to solve positron emission tomography imaging problems in which Poissonian
statistics are dominant. The major advantage of using the EM algorithm is that it involves the solution of linear
equations, whereas the original ML problem is in general a nonlinear optimization problem. Following this
approach, Shepp and Vardi [12] developed an iterative algorithm for which convergence can be proven
analytically. This algorithm also reduces to the familiar Richardson-Lucy iteration which has attained considerable
popularity in the fields of astronomy and medical imaging. For a discretized formulation of the imaging equation
(2), with $g(j)$ and $f(j)$, $j = 1, 2, \ldots, N$, denoting the $N$ pixels of the image and the object respectively, and
$h(j)$ denoting the PSF of the sensor, the updating of the object estimates takes the form

$$ \hat{f}_{k+1}(j) = \hat{f}_k(j) \left[ \left( \frac{g(j)}{\hat{f}_k(j) \otimes h(j)} \right) \otimes h(j) \right], \qquad j = 1, 2, 3, \ldots, N, \tag{15} $$

where $k$ denotes the iteration count and $\otimes$ denotes discrete convolution. The initial estimate $\hat{f}_0(j)$ is taken as
the image $g(j)$ to commence the iteration.
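The update in Equation (15) can be sketched in a few lines. The version below is a minimal illustration under stated assumptions, not the implementation used in this project: the convolution is circular and written as multiplication by a circulant matrix $H$; the PSF is symmetric with unit sum, so that convolution and correlation with $h$ coincide; the data are noise-free and strictly positive; and the names `ml_iteration` and `poisson_log_likelihood`, the Gaussian PSF and the test object are hypothetical.

```python
import numpy as np

def ml_iteration(g, H, num_iter):
    """Iterate Equation (15) in matrix form, starting from f_0 = g."""
    f_est = g.copy()
    for _ in range(num_iter):
        blurred = H @ f_est          # f_k convolved with h
        ratio = g / blurred          # g / (f_k convolved with h)
        f_est = f_est * (H @ ratio)  # H is symmetric here, so H^T = H
    return f_est

def poisson_log_likelihood(g, H, f_est):
    """Poisson log-likelihood of g given f_est, up to a term in g only."""
    model = H @ f_est
    return np.sum(g * np.log(model) - model)

N = 64
idx = np.arange(N)
d = np.minimum(idx, N - idx)
h = np.exp(-0.5 * (d / 1.5) ** 2)         # hypothetical symmetric PSF
h /= h.sum()
H = h[(idx[:, None] - idx[None, :]) % N]  # circulant PSF matrix

f = np.full(N, 0.01)                      # small positive background
f[[20, 24, 40]] += [1.0, 0.8, 0.5]        # hypothetical point sources
g = H @ f                                 # noise-free blurred image

f_ml = ml_iteration(g, H, num_iter=200)

print("log-likelihood, initial :", poisson_log_likelihood(g, H, g))
print("log-likelihood, final   :", poisson_log_likelihood(g, H, f_ml))
print("error norm, blurred     :", np.linalg.norm(g - f))
print("error norm, restored    :", np.linalg.norm(f_ml - f))
```

Because every factor in the update is nonnegative, the estimate stays nonnegative without an explicit constraint step, and with a unit-sum PSF the total intensity of the estimate matches that of the image at every iteration.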

It should be noted that the ML estimate $\hat{f}(j)$ attempts to construct an estimate for $f(j)$, the number of
photons emitted by the j-th sample of the object (which is considered a random variable), from a knowledge of
$g(j)$, the j-th pixel value in the input image, and $h(j)$, the j-th element of the sensor PSF. The optimization
framework in which the algorithm is developed ensures that the likelihood function $p(g|f)$ monotonically
increases over successive iterations of the algorithm and hence the processed image improves in quality as the
algorithm progresses.

3.3  Comparison  of  Algorithms  for  Multispectral  Seeker  Implementations 

Two  distinct  algorithms  for  restoration  and  superresolution  of  image  data  that  employ  iterative  and 
noniterative  computations  were  outlined  in  the  last  two  sections.  Due  to  the  differences  underlying  the  approaches 
followed  in  obtaining  these  algorithms,  they  possess  particular  advantages  and  disadvantages.  For  an  intelligent 
selection  of  the  right  approach  in  a  specific  seeker  application,  one  needs  to  weigh  these  advantages  and 
disadvantages  in  the  light  of  some  basic  requirements  that  need  to  be  met  in  these  implementations.  These  are 
listed  in  the  following: 

1. Flexibility for application to images from different sensing modalities;

2. Performance robustness to tolerate modeling uncertainties, parameter inaccuracies and nonlinearities;

3. Computational requirements that can be met in typical real-time applications;

4. Assurance of the desired level of resolution enhancement in the presence of significant noise levels;

5. Assurance of satisfactory performance in realistic clutter scenarios (with signal-to-clutter ratios (SCR) in
the range 5-10 dB).

The limits on complexity and computational requirements are evident, given the real-time operation
requirement for a missile seeker. On the surface it may appear that noniterative approaches provide an obvious
advantage over iterative approaches on this count since the entire computation needs to be performed only once.
However, there are several factors that need to be considered in this evaluation. First of all, the computation
required for implementing the noniterative estimation given by (13) is considerably more complex than the
implementation of the iteration given by (15). It must be emphasized that the discrete model for the imaging
process, viz. Equation (4), which is used as the basis for the noniterative estimation, results in a matrix $H$ of
typically large dimension. Consequently, the matrix inversion operations required in (13) can pose some
difficulties. Furthermore, $H$ will have a number of zero elements, which further complicates the
computation of inverses. In our implementations we have found that execution of iterations of the form given by
(15) over several cycles is in many cases computationally preferable to the noniterative algorithm (13) that requires
matrix inverse calculations.
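A rough order-of-magnitude count indicates why the dimension of $H$ matters. The sketch below rests on textbook assumptions that are not stated in this report: a dense inversion of an $N \times N$ matrix is taken to cost about $2N^3$ floating-point operations, each convolution in (15) is assumed to be carried out with FFTs at about $5N\log_2 N$ operations per transform, and the image size $M = 128$ is a hypothetical example. The function names are illustrative only.

```python
import math

def dense_matrix_bytes(M, bytes_per_element=8):
    """Storage for the dense N x N matrix H of an M x M image, N = M*M."""
    n = M * M
    return n * n * bytes_per_element

def dense_inverse_flops(M):
    """Assumed cost of inverting a dense N x N matrix: about 2 N^3."""
    n = M * M
    return 2 * n ** 3

def ml_iteration_flops(M):
    """Assumed cost of one pass of (15) using FFT-based convolution.

    Two convolutions, each one forward and one inverse FFT (about
    5 N log2 N operations per FFT), plus a few pointwise operations.
    """
    n = M * M
    fft_flops = 5 * n * math.log2(n)
    return 4 * fft_flops + 14 * n

M = 128                                   # hypothetical image size
print(f"N = M*M                 : {M * M}")
print(f"dense H storage (bytes) : {dense_matrix_bytes(M):.3e}")
print(f"dense inverse (flops)   : {dense_inverse_flops(M):.3e}")
print(f"one ML iteration (flops): {ml_iteration_flops(M):.3e}")
print(f"iterations per inverse  : {dense_inverse_flops(M) / ml_iteration_flops(M):.3e}")
```

These counts ignore any structure in $H$ that a careful noniterative implementation could exploit, so they bound the comparison from one side only.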

Perhaps  a  greater  advantage  of  iterative  algorithms  of  the  type  (15)  comes  from  the  performance 
robustness  to  noise,  clutter  and  parameter  uncertainties.  It  is  evident  that  an  accurate  knowledge  of  the  sensor  PSF 
matrix  H  is  necessary  for  computation  of  the  estimates  in  (13),  whereas  a  significant  tolerance  to  inaccuracies  in 
the  sensor  PSF  is  provided  by  the  ML  iteration  algorithm  (15).  In  fact,  this  algorithm  can  be  modified,  as  will  be 
described  in  the  next  section,  for  a  blind  implementation  when  a  complete  knowledge  of  the  sensor  PSF  is  not 
initially  available  and  one  would  like  to  obtain  improved  estimates  of  the  PSF  parameters  as  the  iterations  progress. 

The requirements arising from the presence of noise and clutter are also evident from the practical
environments in which target detection and classification are to be performed. As noted earlier, there are two main
sources of noise in these applications, viz. the receiver noise, whose statistics depend on the type of imaging sensor
employed and are usually signal dependent, and the quantization noise, which can be realistically modeled by a
zero-mean white Gaussian random field that is independent of the image signal. It is well known that
deconvolution methods (particularly those that attempt to implement directly an inverse filter) are highly sensitive
to noise and require rather high SNR levels for satisfactorily processed images. One may note that typical
radiometric images (PMMW images, for instance) have SNR levels of about 20 dB (or less), thus highlighting the
importance of this requirement.
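For simulation studies it is convenient to generate test data at a prescribed SNR. The sketch below adds zero-mean white Gaussian noise to a signal so that the ratio of mean signal power to noise variance equals a chosen value in dB. This definition of SNR is an assumption, since the report does not state which one is used, and the function name `add_noise_at_snr` and the sinusoidal test scene are hypothetical.

```python
import numpy as np

def add_noise_at_snr(signal, snr_db, rng):
    """Add zero-mean white Gaussian noise at a target SNR (in dB).

    SNR is taken here as mean signal power divided by noise variance.
    Returns the noisy signal and the noise realization that was added.
    """
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise, noise

rng = np.random.default_rng(3)
scene = 1.0 + 0.5 * np.sin(np.linspace(0.0, 8.0 * np.pi, 100_000))
noisy, noise = add_noise_at_snr(scene, snr_db=20.0, rng=rng)

measured_snr_db = 10.0 * np.log10(np.mean(scene ** 2) / np.mean(noise ** 2))
print(f"measured SNR: {measured_snr_db:.2f} dB")
```

At 20 dB the noise standard deviation is one tenth of the root-mean-square signal level.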

Finally,  it  is  beneficial  to  have  the  restoration  algorithm  be  capable  of  processing  signals  collected  from 
different  types  of  sensors.  This  is  due  to  the  fact  that  present  day  missile  seekers  are  required  to  handle  data 
collected  from  different  sensing  modalities  (typically  operating  in  different  frequency  ranges)  and  to  perform 
tactical  decision-making  based  on  the  fused  data.  In  these  environments,  any  signal  processing  on  the  measured 
data  aimed  at  contrast  enhancement  [14]  and/or  resolution  enhancement  (superresolution)  often  comes  as  a  first 
step  operation  which  facilitates  the  further  processing  steps,  such  as  feature  extraction,  feature  integration  for 
fusion,  etc.  [15],  to  be  implemented  more  efficiently.  Use  of  iterative  algorithms  that  have  certain  inherent 
robustness  to  parameter  inaccuracies  can  provide  major  benefits  since  characterization  studies  leading  to  accurate 
determination  of  PSF  may  not  be  readily  available  for  every  detector  in  the  sensor  suite. 

The  qualitative  comparisons  given  in  this  section  favor  the  selection  of  iterative  algorithms  of  the  type 
given  by  (15).  While  convergence  of  iterative  schemes  is  in  general  an  issue  of  concern,  analytical  support  ensuring 
the  convergence  of  the  ML  iterations  in  (15)  is  available.  In  practice,  one  can  terminate  the  iterations  once  a 
desired  resolution  level  in  the  processed  image  is  attained.  Development  of  additional  constructs  that  speed  up  the 
convergence  of  the  ML  iterative  algorithm  given  by  (15)  is  a  topic  that  is  receiving  considerable  attention. 

3.4 Quantitative Performance Evaluation of ML Superresolution Algorithm

Several  experiments  directed  to  evaluating  the  restoration  and  superresolution  performance  of  the  ML 
iterative  algorithm  given  by  (15)  were  conducted  as  part  of  this  project.  Results  from  four  experiments  are  briefly 
summarized  in  this  section  to  illustrate  the  efficiency  of  this  algorithm  in  the  processing  of  various  types  of  input 
signals.  Experiments  1-3  were  principally  aimed  at  measuring  the  spectrum  extrapolation  performance  and  hence 
were  conducted  in  a  “controlled  environment”  by  starting  with  a  known  object  (1-D  or  2-D)  and  blurring  it  with  a 
known  PSF  to  generate  the  input  signal  for  the  processing  algorithm.  On  the  other  hand,  experiment  4  dealt  with 
evaluating  the  performance  in  processing  “real  data”,  which  in  this  case  is  a  passive  MMW  image  collected  by 
Wright  Laboratory  Armament  Directorate  personnel  at  Eglin  AFB  using  a  state-of-the-art  passive  MMW  data 
acquisition platform comprising a 95 GHz radiometer with a 1 foot diameter aperture.

Experiment  1: 

Fig. 3a shows the original object, comprising 3 impulses (point sources) located 2 pixels apart. Fig. 3b
shows the image formed when this object is blurred by a sensor with a cutoff frequency of 21. Figs. 3c and 3d give the
reconstructed images after 90 and 1000 iterations of the ML algorithm. While the restoration of the original object
is clearly seen in these figures, the spectrum extrapolation (and hence superresolution) can be more clearly seen
by examining the frequency spectra of the signals before and after processing. Fig. 4a shows the spectrum of the
original object, Fig. 4b shows the spectrum of the image formed (note the cutoff frequency of 21 of the sensor), and
Fig. 4c shows the spectrum of the reconstructed image after 90 iterations of the ML algorithm. The extrapolation
of frequency components beyond the cutoff at 21 clearly demonstrates that this algorithm is superresolving.

Fig. 3. Signal restoration performance in Experiment 1
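
The setup of Experiment 1 can be imitated numerically. The sketch below is not the report's implementation: the ML iteration (15) is not reproduced in this part of the report, so a Richardson-Lucy style multiplicative update is used as a stand-in, and the triangular OTF shape, the signal length of 128, the impulse positions and all function names are assumptions made for illustration.

```python
import numpy as np

def folded_index(n):
    """|frequency index| of each DFT bin: 0, 1, ..., n/2, ..., 2, 1."""
    idx = np.arange(n)
    return np.minimum(idx, n - idx)

def triangular_psf(n, cutoff):
    """Nonnegative, unit-sum PSF whose OTF falls linearly to zero at `cutoff`."""
    otf = np.clip(1.0 - folded_index(n) / cutoff, 0.0, None)
    psf = np.clip(np.fft.ifft(otf).real, 0.0, None)   # remove rounding-level negatives
    return psf / psf.sum()

def circ_conv(a, b):
    """Circular convolution via the FFT."""
    return np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)).real

def richardson_lucy(data, psf, iterations, eps=1e-12):
    """Multiplicative update (assumed stand-in for (15)), started from the data."""
    data = np.clip(data, 0.0, None)
    otf = np.fft.fft(psf)
    estimate = data.copy()
    for _ in range(iterations):
        predicted = np.fft.ifft(np.fft.fft(estimate) * otf).real
        ratio = data / np.maximum(predicted, eps)
        # Correlate the ratio with the PSF (conjugate OTF), then rescale the estimate.
        correction = np.fft.ifft(np.fft.fft(ratio) * np.conj(otf)).real
        estimate = np.clip(estimate * correction, 0.0, None)
    return estimate

def out_of_band_fraction(signal, cutoff):
    """Fraction of spectral energy at frequency indices >= cutoff."""
    power = np.abs(np.fft.fft(signal)) ** 2
    return power[folded_index(signal.size) >= cutoff].sum() / power.sum()

n, cutoff = 128, 21
obj = np.zeros(n)
obj[[60, 62, 64]] = 1.0                         # three point sources, 2 pixels apart
psf = triangular_psf(n, cutoff)
blurred = circ_conv(obj, psf)
restored = richardson_lucy(blurred, psf, iterations=90)

print(out_of_band_fraction(blurred, cutoff))    # essentially zero: the sensor passes nothing
print(out_of_band_fraction(restored, cutoff))   # energy recreated beyond the cutoff
```

The second printed value being clearly larger than the first is the numerical counterpart of comparing Figs. 4b and 4c: the blurred image has no energy beyond the cutoff, while the nonlinear update recreates some.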
Experiment  2: 

Fig. 5a shows a more complex one-dimensional object that is characterized by several edges and hence offers a
greater challenge to the processing algorithm. The image formed by blurring with a sensor with a cutoff frequency of
21 is shown in Fig. 5b. Fig. 5c shows the reconstructed image after 90 iterations of the ML algorithm, where the
edge reconstruction is clearly seen. The spectrum extrapolation performance can also be seen by comparing Figs.
6a, 6b and 6c, which show the spectra of the original object, the input image and the ML processed image (after 90
iterations). It should be noted that only the portion of the spectrum in the high frequency range is plotted to an
expanded scale in these figures, since the signal has low frequency components with relatively large magnitudes
that prevent the high frequency components from being displayed effectively on the same graph. For testing the
restoration performance in the presence of noise, another experiment was conducted by blurring the same object as
before and then corrupting it with additive Gaussian noise to yield an image with a Signal-to-Noise Ratio
(SNR) of 30.0248 dB, shown in Fig. 7a. The reconstructed image after 200 iterations of the ML algorithm is shown in
Fig. 7b, which confirms the noise filtering properties of the algorithm.
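
A noisy input at a prescribed SNR, as used in this test, can be generated along the following lines. This is a minimal sketch: the report does not state its SNR definition, so the ratio of mean signal power to noise variance is assumed here, and the test signal and function names are hypothetical.

```python
import numpy as np

def add_gaussian_noise(signal, snr_db, rng):
    """Add zero-mean white Gaussian noise scaled to give the requested SNR (in dB)."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / 10.0 ** (snr_db / 10.0)
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def measured_snr_db(clean, noisy):
    """SNR in dB, taking the noise to be the difference noisy - clean."""
    noise = noisy - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))

rng = np.random.default_rng(0)
clean = np.repeat([0.0, 1.0, 0.5, 2.0, 0.0], 1000)   # hypothetical piecewise-constant signal
noisy = add_gaussian_noise(clean, snr_db=30.0, rng=rng)
snr = measured_snr_db(clean, noisy)
print(round(snr, 2))                                  # close to, but not exactly, 30 dB
```

Because the noise is a finite random sample, the measured value differs slightly from the target, which may be why the report quotes a figure such as 30.0248 dB rather than a round number.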

Experiment  3: 

The superresolution performance of these algorithms was also tested by processing a 256 x 256 image
from our database. Fig. 8a shows the original image (Lena) used in this experiment. Fig. 8b shows the blurred
image obtained by convolution with the PSF of a sensor with cutoff frequency ω_c = 63 (which is approximately one-half of
the folding frequency and is typical of practical imaging operations). The restoration performance of the ML algorithm
as the number of iterations is gradually increased is shown in the next set of figures (Figs. 8c - 8f, where the
restored images after 10, 20, 90, and 100 iterations are shown). The resolution enhancement with only 10
iterations of the algorithm is clearly visible. The algorithm was stopped at the end of 100 iterations since the
resolution is comparable to that of the original (unblurred) image. Figures 9a - 9d show the power spectra of
various signals; the frequency extrapolation performed by the ML algorithm to achieve superresolution is clearly
noticeable by comparing the four corners in Figs. 9c and 9d, which show the power spectra of the blurred image
and the reconstructed image after 100 iterations of ML.
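
The role of the spectrum corners can be illustrated with a small two-dimensional sketch. It assumes a separable triangular OTF (the report does not give the OTF shape), uses a random array as a stand-in for the test image, and all names are hypothetical.

```python
import numpy as np

def folded_index(n):
    """|frequency index| of each DFT bin: 0, 1, ..., n/2, ..., 2, 1."""
    idx = np.arange(n)
    return np.minimum(idx, n - idx)

def separable_triangular_otf(n, cutoff):
    """2-D OTF built as the outer product of two 1-D triangular OTFs."""
    tri = np.clip(1.0 - folded_index(n) / cutoff, 0.0, None)
    return np.outer(tri, tri)

def corner_energy_fraction(image, cutoff):
    """Fraction of spectral energy where both frequency indices are >= cutoff."""
    power = np.abs(np.fft.fft2(image)) ** 2
    high = folded_index(image.shape[0]) >= cutoff     # assumes a square image
    corners = high[:, None] & high[None, :]
    return power[corners].sum() / power.sum()

n, cutoff = 256, 63
rng = np.random.default_rng(1)
image = rng.random((n, n))                            # stand-in for the test image
blurred = np.fft.ifft2(np.fft.fft2(image) * separable_triangular_otf(n, cutoff)).real

print(cutoff / (n // 2))                              # cutoff relative to the folding frequency
print(corner_energy_fraction(image, cutoff))          # nonzero for the unblurred image
print(corner_energy_fraction(blurred, cutoff))        # essentially zero after blurring
```

In a centered display of the power spectrum, the bins counted by `corner_energy_fraction` are the four corners; a superresolving algorithm shows up as energy reappearing there.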

Experiment  4: 

The  performance  of  the  ML  algorithm  was  also  tested  by  processing  a  set  of  passive  MMW  images 
supplied  by  the  Wright  Laboratory  Armament  Directorate  personnel.  These  images  were  collected  under  various 
conditions  (time-of-day,  atmospheric  conditions,  etc.),  which  were  not  known  save  for  the  fact  that  the  images  were 
obtained  by  a  single  detector  radiometer  with  1  foot  aperture  at  95  GHz.  For  illustration  purposes,  the  result  of 
processing  one  of  these  images  (“Jeep  2”  image)  will  be  presented  here. 
Fig. 8. Image restoration performance in Experiment 3

The received image to be processed is shown in Fig. 10a. It is evident that the resolution is quite
poor and that the extraction of any useful features from this image is doubtful. The image can benefit from
further processing aimed at resolution enhancement. Since a complete characterization of the sensor and the
imaging conditions was not available to determine the sensor PSF, an approximate analysis was conducted to
identify the cutoff frequency in the optical transfer function (OTF). Starting with an intensity profile
corresponding to an edge of the object in the image (as shown in Fig. 11a, which gives the intensity profile in
column 100), an OTF was created as a low-pass filter whose convolution with an input pulse object (shown in Fig.
11b) yields an outcome (shown in Fig. 11c) that is similar to the edge profile. The cutoff in the OTF in this case
was estimated to be 17. While this process gives only a rough estimate of the OTF of the imaging system, through
iterative adjustments it is possible to obtain a sufficiently accurate characterization of the sensor PSF to commence
the ML iterations for superresolution.
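
The edge-matching procedure can be sketched as a search over candidate cutoffs. The example below is only an illustration under stated assumptions: a triangular low-pass OTF, a synthetic "measured" profile generated with cutoff 17, a pulse already aligned with the profile, and hypothetical function names.

```python
import numpy as np

def lowpass(signal, cutoff):
    """Blur `signal` with a triangular OTF that reaches zero at `cutoff`."""
    n = signal.size
    idx = np.arange(n)
    k = np.minimum(idx, n - idx)                      # |frequency index| per DFT bin
    otf = np.clip(1.0 - k / cutoff, 0.0, None)
    return np.fft.ifft(np.fft.fft(signal) * otf).real

def estimate_cutoff(edge_profile, pulse, candidates):
    """Return the candidate cutoff whose blurred pulse best matches the profile."""
    errors = [np.sum((lowpass(pulse, c) - edge_profile) ** 2) for c in candidates]
    return candidates[int(np.argmin(errors))]

n = 128
pulse = np.zeros(n)
pulse[48:80] = 1.0                                    # ideal pulse object (hypothetical)
measured = lowpass(pulse, 17)                         # synthetic stand-in for the image column
best = estimate_cutoff(measured, pulse, list(range(5, 41)))
print(best)
```

With a real image column, the pulse position, width and height would first have to be matched to the profile, and the minimum error would be nonzero; the search then gives the rough starting estimate that the iterative adjustments refine.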

Figs. 10b - 10g show the results of processing the input image in Fig. 10a after 10, 20, 30, 40, 60, and 100
cycles of ML iteration. The progressive enhancement of resolution is clearly evident from the strengthening of
edges and the improved structural details of the object (particularly near the wheels, the windows and the top of the
vehicle).


4. PERFORMANCE IMPROVEMENT BY ITERATIVE BLIND ML RESTORATION

As discussed earlier, if the sensor is fully characterized and if the images contain good target/scene
metrics such as exact time of day, distances to objects, weather conditions, etc., one may attempt to model the
sensor PSF exactly and employ it for restoration of the image [16]. When the PSF parameters are not readily
available, one may attempt to build an approximation to the PSF from the image to be processed using techniques
such as the one described in the last section (looking for a column or row in the image corresponding to a sharp
edge and matching this profile with the blurred version of a sharp edge passed through an approximately tailored
OTF). Since the performance of any deconvolution procedure improves with the availability of accurate PSF
parameters, it is useful to consider implementations of iterative restoration and superresolution algorithms where
the PSF estimates can be adaptively updated along with the iterative construction of the object estimate. Such
implementations can be regarded as special cases of “blind deconvolution” algorithms, which attempt to perform
image restoration with incomplete knowledge of both the PSF and the object (i.e., $h$ and $f$ in the imaging process
model (14)). In this section we shall present one such implementation and discuss its performance when applied to
superresolution of PMMW images.

4.1  ML  Algorithm  for  Joint  Estimation  of  Object  and  PSF 

For the presentation of the new algorithm, it is useful to briefly review the mathematical basis underlying
the iterative ML scheme given by (15). Starting with the imaging equation (14) and assuming that the additive
noise can be modeled as an independent and identically distributed (i.i.d.) random variable with a zero-mean
Gaussian probability density having variance $\sigma^2$, the intensity detected in the image, $g(y)$, can be regarded as a
random variable with a normal probability density

\[ p[g(y) \mid h, f] = \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp\left\{ -\frac{\bigl[ g(y) - \sum_{x \in X} h(y,x) f(x) \bigr]^{2}}{2\sigma^{2}} \right\}. \tag{16} \]

It may be noted that the noise model assumed is particularly appropriate if the dominant noise is thermal noise.
The probability density for realizing the entire data set $\{g(y),\ y \in Y\}$ can hence be obtained as

\[ P[g \mid h, f] = \prod_{y \in Y} p[g(y) \mid h, f], \tag{17} \]

which serves as a model for the likelihood function for evaluating the ML estimate $\{\hat{f}(x)\}$, i.e., the estimate of
$\{f(x)\}$ that is most likely to have produced the data $\{g(y)\}$ through $\{h(y,x)\}$. It is obtained by maximizing $P$ in (17).

For simplicity in computation, one conducts the maximization of a modified log-likelihood function

\[ L[g \mid h, f] = -\sum_{y \in Y} \Bigl[ g(y) - \sum_{x \in X} h(y,x) f(x) \Bigr]^{2}, \tag{18} \]

which is obtained by taking the natural logarithm of $P$ in (17) and removing the constant term and the positive
scale factor, neither of which affects the maximization process. If $L$ is maximized with respect to $f$ assuming $h$ to be known, we have the standard
ML restoration, whereas if $L$ is maximized with respect to both $f$ and $h$ by searching over a larger parameter
space, we have blind ML restoration.
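
The step from the likelihood to the modified log-likelihood can be written out explicitly. The following is a sketch under the i.i.d. Gaussian noise model stated above, with $|Y|$ (a symbol introduced here) denoting the number of image samples:

```latex
\begin{align*}
\ln P[g \mid h, f]
  &= \sum_{y \in Y} \ln p[g(y) \mid h, f] \\
  &= -\frac{|Y|}{2}\,\ln\!\left(2\pi\sigma^{2}\right)
     - \frac{1}{2\sigma^{2}} \sum_{y \in Y}
       \Bigl[\, g(y) - \sum_{x \in X} h(y,x)\, f(x) \Bigr]^{2} .
\end{align*}
```

The first term does not depend on $f$ or $h$, and the factor $1/(2\sigma^{2})$ is positive, so dropping both leaves the maximizer unchanged. Maximizing the modified log-likelihood is therefore the same as minimizing the squared residual between the data and the blurred object estimate.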

The iterative scheme given in (15) provides a discretized implementation of the standard ML restoration
process. It can be easily converted into a blind restoration scheme by observing that $\hat{f}(j)$ and $h(j)$ are
interchangeable in the two convolution operations in (15), and hence an updating scheme for $h(j)$ can be
developed for estimation of the PSF along with the object estimation, i.e., obtaining $\hat{f}(j)$. Briefly,
each cycle of this “Blind ML restoration algorithm” consists of executing the following two steps:

Step 1: Implement the “object estimation” algorithm through $m$ iterations of (15), with an initial guess for $h(j)$
and with $\hat{f}_0(j) = g(j)$.   (19)

Step 2: Implement the “PSF updating” algorithm through $n$ iterations of (15) with the roles of $\hat{f}(j)$ and $h(j)$
interchanged.   (20)


The  algorithm  hence  continuously  reshapes  the  PSF  to  result  in  improved  object  estimation  after  each  cycle  of 
implementation.  A  flow-chart  depicting  the  various  steps  is  shown  in  Fig.  12.  The  algorithm  can  be  run  iteratively 
over  several  cycles  until  a  specified  maximum  iteration  count  is  reached  or  a  processed  image  with  a  satisfactory 
resolution  level  is  attained.  The  quality  of  estimation  depends  on  the  number  of  object  estimation  iterations  m  and 
PSF  updating  iterations  n  implemented  within  each  cycle.  Some  results  describing  the  performance  of  this 
algorithm  will  be  outlined  in  the  next  section. 


Fig.  12.  Flow-chart  for  implementation  of  blind  ML  restoration  algorithm. 
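
The two-step cycle can be sketched in code. This is an illustration, not the report's algorithm: a Richardson-Lucy style multiplicative update is assumed for both steps, circular convolution via the FFT is used for brevity, and the test object, PSF shapes and function names are hypothetical.

```python
import numpy as np

def circ_conv(a, b):
    """Circular convolution via the FFT."""
    return np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)).real

def circ_corr(kernel, a):
    """Circular correlation: sum over y of kernel(y - x) * a(y)."""
    return np.fft.ifft(np.conj(np.fft.fft(kernel)) * np.fft.fft(a)).real

def multiplicative_update(target, fixed, data, eps=1e-12):
    """One multiplicative update of `target` with `fixed` held constant."""
    ratio = data / np.maximum(circ_conv(fixed, target), eps)
    return np.clip(target * circ_corr(fixed, ratio) / fixed.sum(), 0.0, None)

def blind_restoration(data, psf0, cycles, m, n):
    """Alternate m object updates and n PSF updates in each cycle."""
    data = np.clip(data, 0.0, None)
    obj, psf = data.copy(), psf0 / psf0.sum()
    for _ in range(cycles):
        for _ in range(m):                        # Step 1: object estimation
            obj = multiplicative_update(obj, psf, data)
        for _ in range(n):                        # Step 2: PSF updating (roles swapped)
            psf = multiplicative_update(psf, obj, data)
            psf = psf / psf.sum()
    return obj, psf

def triangular_psf(size, cutoff):
    """Nonnegative, unit-sum PSF whose OTF falls linearly to zero at `cutoff`."""
    idx = np.arange(size)
    k = np.minimum(idx, size - idx)
    otf = np.clip(1.0 - k / cutoff, 0.0, None)
    psf = np.clip(np.fft.ifft(otf).real, 0.0, None)
    return psf / psf.sum()

size = 128
true_obj = np.zeros(size)
true_obj[30:40] = 1.0                             # three pulses of uneven heights
true_obj[55:65] = 2.0
true_obj[85:95] = 1.5
data = circ_conv(true_obj, triangular_psf(size, 17))
psf0 = triangular_psf(size, 12)                   # deliberately wrong initial cutoff
obj, psf = blind_restoration(data, psf0, cycles=10, m=5, n=2)

residual0 = np.linalg.norm(data - circ_conv(np.clip(data, 0.0, None), psf0))
residual = np.linalg.norm(data - circ_conv(obj, psf))
print(residual0, residual)                        # data mismatch before and after
```

The same function serves both steps because only the roles of the object and the PSF are exchanged, which mirrors the interchangeability argument made above. The printed residuals only show that the pair of estimates explains the data better than the starting guess; they do not by themselves show that the PSF estimate is correct.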
4.2 Performance of Blind ML Restoration Algorithm

Several  experiments  have  been  conducted  to  test  the  performance  of  the  blind  ML  restoration  algorithm 
described  by  (19)  and  (20)  for  the  superresolution  processing  of  both  one-dimensional  and  two-dimensional 
signals.  A  few  of  these  will  be  described  in  the  following. 

Experiment  1: 

Starting with a one-dimensional object consisting of three pulses of uneven heights, shown in Fig. 13a, a
blurred image, shown in Fig. 13b, was obtained by convolving with a sensor with cutoff frequency 17 and an OTF
whose profile is shown in Fig. 13c. To commence the blind ML iterations, an initial estimate of the PSF was made
by assuming an OTF with cutoff frequency 12 (an error was made deliberately to test the ability of the
algorithm to yield estimates progressively moving towards the true cutoff frequency of 17) and taking its inverse
Fourier transform. The assumed OTF and the PSF are shown in Figs. 13d and 13e.

The OTF estimates resulting after 1 and 3 cycles of the algorithm are shown in Figs. 14a and 14b, which
indicate the progression of the algorithm in expanding the OTF towards the true cutoff frequency. Each cycle of
algorithm implementation consists of five object estimation iterations (m = 5) followed by two PSF updating
iterations (n = 2). The final results after 10 cycles of algorithm implementation are shown in Figs. 14c, 14d and
14e, which show the estimated OTF, the corresponding PSF and the restored object, respectively.

Experiment  2: 

In this experiment we tested the performance of the blind ML restoration algorithm in processing PMMW
images supplied by the Wright Laboratory Armament Directorate. The specific image used as input to the
algorithm is the “Parking Lot 3” image, which is shown in Fig. 15a. To obtain an initial estimate of the PSF to
commence the iterations, an analysis similar to the edge-profile matching described earlier was made
and an OTF with a cutoff frequency of 82, shown in Fig. 15b (only one dimension of the OTF is shown here, for
simplicity), was developed. The ML restoration was implemented over 20 cycles with 5 ML estimation iterations
(m = 5) and 2 PSF updating iterations (n = 2) in each cycle. The restored images at the end of 5 cycles and 20
cycles are shown in Figs. 15c and 15d, respectively. The progressive enhancement of resolution is clearly evident from the improved
structural details of the building and the parked automobiles.

To obtain a sense of the degree of enhancement of high frequency components, the ratio of the amplitudes
of particular spectral components in the processed image to the
amplitudes of the same components in the original image was calculated. This was done by obtaining the 2-dimensional FFT
of the original image (in Fig. 15a) and extracting the first row of the Fourier domain amplitude spectrum. The
selection of the first row is only for illustrative purposes. By calculating the average of the amplitudes of 5 adjacent
pixels, an amplitude histogram, shown in Fig. 16a, was developed. Since the amplitude values in the low frequency
range are very high compared to those of the high frequency components, amplitude thresholding was
used to display the high frequency portion more clearly, as shown in Fig. 16b. The corresponding histograms
computed from the processed images after 5 and 20 cycles of ML restoration (i.e., from Figs. 15c and 15d) are
shown in Figs. 16c and 16d, respectively. To give a better quantitative comparison, the amplitude ratios
at the same frequency values were evaluated by dividing the amplitude from the processed image by the
amplitude from the original image. Plots of these amplitude ratios for the images after 5 cycles and 20
cycles of processing are shown in Figs. 16e and 16f, which clearly demonstrate the enhancement of high frequency
spectral components by the present ML restoration algorithm.

Fig. 16. Frequency enhancement performance of Blind ML algorithm
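
The amplitude-ratio computation described above can be sketched as follows. The function names are hypothetical, the first row is taken in the unshifted FFT ordering (DC first), and a scaled copy of a random array stands in for the processed image so that the expected ratio is known.

```python
import numpy as np

def first_row_band_amplitudes(image, band=5):
    """Average the first row of the 2-D FFT amplitude spectrum over `band` adjacent bins."""
    row = np.abs(np.fft.fft2(image))[0]
    usable = (row.size // band) * band            # drop a ragged tail, if any
    return row[:usable].reshape(-1, band).mean(axis=1)

def amplitude_ratios(processed, original, band=5, eps=1e-12):
    """Band-averaged amplitude of the processed image divided by that of the original."""
    numerator = first_row_band_amplitudes(processed, band)
    denominator = first_row_band_amplitudes(original, band)
    return numerator / np.maximum(denominator, eps)

rng = np.random.default_rng(2)
original = rng.random((40, 40))                   # stand-in for the input image
processed = 2.0 * original                        # stand-in with a known ratio of 2
ratios = amplitude_ratios(processed, original)
print(ratios.shape, np.allclose(ratios, 2.0))
```

For a restored image, ratios well above 1 in the high frequency bands would correspond to the enhancement shown in Figs. 16e and 16f.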

5.  CONCLUSIONS 

Studies directed to the evaluation of the performance of iterative and noniterative schemes for image
restoration and superresolution processing were conducted in this project with a specific focus on their eventual
deployment in multispectral seeker environments. These studies are of particular relevance to the processing of
images collected from various sensors used in these environments to provide enhanced image resolutions that can
facilitate better false target rejection, improved automatic target recognition and aimpoint selection. The selection
of an iterative technique or a noniterative scheme, however, results in considerably different performance levels and can
bring specific advantages and disadvantages to the overall restoration and superresolution function. Some
qualitative comparisons of the performance expected from these schemes were given in this report. A more
quantitative performance evaluation of an ML superresolution algorithm was conducted in this project and results of
this analysis were also presented here. Finally, to aid in a robust implementation of the ML algorithm in cases
where an accurate model of the sensor PSF is not available, a modified algorithm that jointly estimates the object
imaged and the PSF parameters from an initial approximate guess of the PSF was developed, and the
performance of this blind ML restoration algorithm in processing various signals, including PMMW images
supplied by Wright Laboratory Armament Directorate personnel, was described. Based on these studies, it may be
concluded that ML techniques offer an attractive framework for tailoring image restoration and superresolution
algorithms for implementation in multispectral seeker environments. Further studies directed to improving the
computational efficiency of these algorithms and to evaluating the noise tolerance (tradeoff between resolution and
noise filtering) would be highly useful and are planned for future investigation.
Abstract

The friction, wear and adhesion behaviors of TiC coatings containing different kinds of
interlayers (Ti, Cr and Mo) of different thicknesses have been investigated using a pin-on-disk
configuration, in which both stainless steel and alumina pins were used. The wear tracks of the
TiC coatings and the pins were characterized by either optical microscopy or scanning electron
microscopy (SEM). Inserting a Cr or Ti interlayer between the TiC coating and the substrate could
greatly improve the adhesion and wear resistance. The TiC coating with a 500 Å Cr interlayer showed better
overall characteristics than that with a 5000 Å Cr interlayer. The same was found to be true for the
Ti interlayer, i.e., thinner interlayers performed better in the adhesion and wear tests. This is
probably due to the improved lattice match between the TiC coatings and the interlayers when
the interlayer thickness was reduced. The wear resistance of the TiC coatings was found to
increase rapidly as their thickness was changed from 0.2 μm to 2 μm, and to 3 μm, which was
closely related to the load bearing capability of the coatings. These TiC coatings, with or
without metal interlayers, have been nitrogen ion implanted in the interface region. The effects
of such ion implantation on their adhesion and wear are currently being studied.
FRICTION AND WEAR OF TiC COATINGS


Jinke  Tang 


Introduction 

Because of their many advantages, such as high melting point, high hardness, wear resistance,
chemical stability and large modulus, ceramic coating materials (e.g., TiC, TiN, SiC, and
diamond) have been used in tribological applications to reduce the wear of contacting
components. The applications range from aerospace and engine parts to wear resistant barriers for
many moving mechanical assemblies [1,2]. Generally, TiC film deposition has been done by
chemical vapor deposition, physical vapor deposition [3,4], magnetron sputtering, pulsed laser
deposition (PLD) [5,6], etc. Sessler et al. [7] investigated the influence of film deposition
temperature, substrate hardness, and counterface hardness on the friction and wear behavior of
TiC films grown by excimer pulsed laser deposition. The thickness of the TiC coatings was
controlled at 0.2 μm. Tang et al. [8] found that the adhesion of magnetron sputtered TiC coatings
could be modified by inserting a metallic interlayer between the coating and the stainless steel substrate.
The PLD TiC coatings grown at room temperature were harder than the TiC coatings grown by
magnetron sputtering under the given experimental conditions.

The purpose of the present work was to study the effects of different metallic interlayers (e.g.,
Cr, Ti, and Mo) on the friction and wear of TiC coatings grown by magnetron sputtering. The
friction and wear properties of coatings with different thicknesses of TiC film and
interlayer were characterized by combining tribological experiments with optical microscopic
analysis.

Experimental 

TiC coatings were grown using the magnetron sputtering technique. The 440C stainless steel
(SS) substrates were first rinsed with acetone and isopropanol before they were sputter etched
in a diffusion pumped MRC 902 in-line sputter deposition chamber. The chamber was backfilled
with methane and argon at constant flow rates of 40 and 80-90 cm³/min, respectively. The bias
voltage on the substrate was -100 V and the total pressure was controlled at 8 mtorr. The chamber
was pumped to a base pressure of 1x10"* torr before sputter etching of the substrates.

Tribological experiments were carried out on a computer-controlled ISC-200 pin-on-disk tribometer (Implant
Sciences Co.) at room temperature in laboratory atmosphere. The sliding
speed of the pin on the disk was constant at 10.21 cm/sec and the loads in all tests were chosen from 25
gram to 500 gram. The substrate, a pellet 25.5 mm in diameter, was made of 440C stainless
steel and the thickness of the TiC coating was 0.2 μm, 2 μm and 3 μm, respectively. Alumina
and 440 SS pins with a diameter of 0.125 inch were alternatively used in order to study the
influence of ball hardness on friction and wear. Wear traces of the coatings and wear scars of the pins
were analyzed by optical microscopy or scanning electron microscopy (SEM).

Results  and  Discussion 

1. Influence of counterface hardness on the friction and wear behavior of the TiC coatings

The friction and wear experiments were conducted using a pin-on-disk configuration, in
which two types of pins, 440 SS and alumina balls, were used. Figure 1 shows the variation
of the friction coefficient with the SS ball sliding over the surface of a TiC coating with a thickness
of 2 μm under loads of 500 g, 250 g, 100 g and 50 g, respectively. Figure 1 shows that
the friction coefficient increased abruptly, to as high as 0.5-0.9, when the pin started to move on
the film. It then quickly decreased with sliding, finally reaching an equilibrium value. The value
was between 0.2 and 0.22 for the loads of 50 g, 100 g, and 250 g, but only about 0.12 for 500 g.
The initial coefficient of friction in all of the above experiments was much higher than the corresponding
equilibrium coefficient of friction. When the 440 SS ball and the TiC film were brought into
contact and sheared, both the adhesion between the hard TiC coating and the soft SS pin and the surface
roughness played an important role in determining the initial coefficient of friction. With the SS
pin sliding repeatedly over the same track of the TiC film, the surface roughness of the pin and the TiC
film was reduced and the asperities between pin and coating were partly filled by debris (seen
in the optical micrographs of the pins that follow). Eventually, a stable equilibrium coefficient of
friction was reached. It is easily noticed that the friction coefficient under the load of 500 g was
only 0.12, much lower than the others. A coefficient of friction of 0.16 was found for a load of 400
g (not shown in Figure 1). This may be because the decrease in friction resulting from blunting
the tips of surface asperities and reducing surface roughness is greater than the corresponding
increase resulting from the larger contacting surface when sliding continues under a heavier load.
The friction tracks of the coating and the wear scars of the SS pin shown in Figure 2 provide
evidence of this. A wider wear track on the film and a larger wear surface on the ball occurred under
the heavier load, i.e., 500 gram. This load dependence of the friction seems to be directly related to
the surface roughness and the increase in contacting surface.

Figure 1 Friction coefficient of the TiC coating (2 μm)
on 440 stainless steel disk as a function of wear time
under the loads: 500 g, 250 g, 100 g and 50 g,
respectively. (speed=10.21 cm/sec, SS pin)

Friction and wear behavior was also investigated using an alumina pin under the same conditions
as those used for the SS pins above. Figure 3 presents the typical friction traces obtained using the
alumina pin and the SS pin in the tests. Both equilibrium coefficients of friction remained in
the range of 0.19-0.22. However, the friction coefficient for the TiC/alumina wear couple did not
increase as suddenly as that for the TiC/SS wear couple when the pin started sliding on the
surface of the TiC film. The initial coefficient of friction of the TiC/alumina couple is much lower than
that obtained with the SS pin. After 2500 cycles of sliding, the coefficient of friction reached
a considerably stable value. The lower initial friction for the TiC/alumina wear couple could
probably be related to weak adhesion between the harder alumina ball (compared to the SS balls) and the TiC
coating. The friction traces for the TiC/alumina couple in Figure 3(b) exhibited a strong stick-slip
characteristic from the beginning of the test, especially after 23000 cycles of sliding. From the
micrographs shown in Figure 4, it is found that the TiC coating was scratched and apparently worn
through by the alumina pin, whereas the contacting surface was only slightly worn for the TiC/SS couple.
The equilibrium coefficient of friction did not increase to ~0.6 (0.6-1.0 for the SS substrate)
because the track was not worn through completely. Only a small scar occurred on the alumina pin,
whereas there was a large scar on the SS ball. In the experiments using the TiC/SS couple, the wear
occurred mainly on the SS ball and the TiC coating was only worn off to some extent.

Figure 2 Optical micrographs of worn surfaces of 2 μm TiC coating and
440C stainless steel pin under different applied loads: (a) 250 g, (b) 500
g, (c) 250 g, (d) 500 g. (Speed=10.21 cm/s)

Figure 3 Friction coefficient of the TiC coating on 440
stainless steel disk as a function of wear cycle using
both alumina pin and SS pin. (load=250 g and speed
=10.21 cm/sec)

Figure 4 Worn traces of 2 μm TiC coating under the condition of (a)
stainless steel pin and (b) alumina pin. (Load=250 g, Time=4 hours, and
speed=10.21 cm/s)

2. TiC coating containing metal interlayers
2.1. 2 μm TiC coating with interlayer

Generally, adhesion between substrate and coating can be greatly affected by interface
structures. Tang and co-workers [8] verified that the adhesion of the TiC coating was improved
greatly by inserting a metallic interlayer, such as Cr or Ti. In this study, the friction and wear have
been examined for TiC coatings with interlayers grown by magnetron sputtering,
as shown in Figure 5. It is easily seen that the friction curve for the TiC coating containing a Cr
or Ti interlayer had less stick-slip than that for the TiC coating without an interlayer. During the
rotation of 23400 cycles (about 3 hours), the samples with a 500 Å Ti, 500 Å Cr or 5000 Å Cr
interlayer showed more stable friction and wear properties. The friction trace for 5000 Å Ti was
not as good as the three mentioned above. The equilibrium coefficient of friction for all of the
samples remained constant at 0.19-0.22 during the sliding processes.

The corresponding optical micrographs are presented in Figure 6. The TiC coatings were
worn off to different degrees, and islands of broken coating formed on the wear tracks.
The number and size of the islands varied from sample to sample. The TiC coatings containing 500 Å
Cr, 500 Å Ti, and 5000 Å Cr interlayers showed less such delamination than the coating with the
5000 Å Ti interlayer. The TiC coating with the 500 Å Cr interlayer showed the best wear resistance
of all the samples. This is attributed to very strong adhesion, and possibly hardness, after the Cr
interlayer was inserted between the substrate and coating [8]. The micrographs in Figure 6 also
show that friction and wear behavior was better for the coating containing the thinner (500 Å) Cr
interlayer than for that with the thicker (5000 Å) Cr interlayer. Although the coatings showed good
friction properties after a Ti interlayer was inserted, the interlayer was not very effective when
it was as thick as 5000 Å. The largely destroyed coating on the wear track in the micrograph of the
TiC coating with the 5000 Å Ti interlayer is consistent with the large stick-slip seen in its friction curve.
On the wear tracks of the TiC coatings containing interlayers, there were significant streak lines,
especially around the islands. It seems that, after the alumina pin had slid over the surface of the TiC
coating for some time, cracks gradually appeared along certain directions on the wear
surface. The cracks were caused by the ball sliding repeatedly over the coating surface under load.
Interestingly, streak lines often appeared on the surface of TiC coatings containing
a Ti or Cr interlayer under loads of 100-250 g (Figure 7). In contrast, there were no apparent streak
lines for coatings without an interlayer, even after the load was increased to 500 g, at which the
coating was seriously delaminated. It seems that the TiC coating without an interlayer was
Figure 5 Friction coefficient of the TiC coating (2 μm) without and
with interlayers on stainless steel substrate as a function of
number of passes under a load of 250 g and a speed of 10.21 cm/s.
A, no interlayer; B, 500 Å Ti; C, 5000 Å Ti; D, 500 Å Cr; E, 5000 Å Cr.
Figure 6 Optical micrographs of the worn traces for 2 μm TiC coating with
different interlayers: (a) 500 Å Cr, (b) 5000 Å Cr, (c) 500 Å Ti,
(d) 5000 Å Ti, (e) no interlayer. (Load = 250 g, time = 3 hours,
and alumina pin).
Figure 7 Worn traces of 1 μm TiC coating (a) containing a 500 Å Ti
interlayer and (b) without interlayer: (a) 100 g load (left) and 150 g load
(right), (b) 100 g load.
directly delaminated because of the weak adhesion between the TiC coating and the SS substrate, while
coatings containing a Ti or Cr interlayer first underwent wear track cracking before
delamination occurred on the contacting surface. This wear track cracking might be
attributed to the softer Ti or Cr interlayers, but the enhanced adhesion due to the
interlayers was able to prevent substantial delamination. The cracks on the wear track appear to be
precursors of the islands for the TiC coatings with a Ti or Cr interlayer. There were fewer crack
lines on the friction surface of the TiC coating with the 500 Å Cr interlayer. The surface of the TiC
coating with the 5000 Å Ti interlayer was seriously scratched after 3 hours of wear. The
alumina pin was also abraded as the TiC coating was worn through. It can be seen in Figure
8 that a large ball scar formed where the TiC coating was seriously scratched, especially for the TiC
coating with the 5000 Å Ti interlayer.

Figure 9 shows the coefficients of friction of TiC coatings in long-distance tests of over 3500
meters (about 10 hours). These tests were done to further examine the friction and wear life of the
coatings. The variation of the curves is similar to that in Figure 5. However, the friction for the samples
containing the 5000 Å Ti interlayer and without an interlayer became unstable in the later stage of sliding.
The friction coefficient for TiC coatings containing 500 Å and 5000 Å Cr interlayers started to increase
after ~3400 meters of sliding. Combining this with the optical micrographs of the TiC
coatings shown in Figure 10, we found that the wear tracks of the TiC coatings were heavily scratched and
delaminated in these experiments, especially for the coatings without an interlayer and with the 5000 Å Ti
interlayer, where extended regions of broken coating formed instead of islands. Although there were
heavy scratches on the surfaces of the TiC coatings with 500 Å Ti, 500 Å Cr, and 5000 Å Cr interlayers,
their friction coefficients still varied within the range of 0.2-0.3 during sliding. This
indicates that even though part of the TiC coating was scratched off in the friction test, some TiC
debris probably filled the gap between the contacting surfaces of the ball and the TiC coating to some
extent, so that the friction coefficient increased only slightly. With heavy delamination of
the TiC film, the friction coefficient changed significantly, for example, in the case of the TiC
coating without an interlayer or with the 5000 Å Ti interlayer. Similar to the results in Figures 5 and 6,
long scratches on the wear surface were accompanied by large scars on the ball surface.
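
The test durations, sliding distances, and cycle counts quoted in this section can be cross-checked from the sliding speed given in the figure captions (10.21 cm/s). The short Python sketch below does that arithmetic. The function names are illustrative only, and the path length per cycle is an inference from the quoted 23400 cycles in about 3 hours, not a value stated in the report.

```python
# Cross-check of sliding distance, test duration, and cycle count.
SPEED_CM_PER_S = 10.21  # sliding speed quoted in the figure captions


def sliding_distance_m(hours, speed_cm_per_s=SPEED_CM_PER_S):
    """Distance slid by the pin, in meters, after the given test duration."""
    return speed_cm_per_s * hours * 3600.0 / 100.0


def path_per_cycle_cm(hours, cycles, speed_cm_per_s=SPEED_CM_PER_S):
    """Path length per disk revolution implied by a duration and a cycle count."""
    return speed_cm_per_s * hours * 3600.0 / cycles


short_test_m = sliding_distance_m(3.0)       # 3-hour tests: roughly 1100 m
long_test_m = sliding_distance_m(9.8)        # 9.8-hour tests: just over 3500 m
per_cycle_cm = path_per_cycle_cm(3.0, 23400) # inferred, not stated in the report

print(f"{short_test_m:.0f} m, {long_test_m:.0f} m, {per_cycle_cm:.2f} cm per cycle")
```

Under these assumptions, the 9.8-hour wear time in the Figure 10 caption is consistent with the "over 3500 meters" quoted for the long-distance tests.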

The friction and wear behavior of the TiC coatings was clearly improved after
Figure 8 Optical micrographs of the alumina pin scar coupled with TiC
coating containing different interlayers: (a) 500 Å Cr, (b) 5000 Å Cr,
(c) 500 Å Ti, (d) 5000 Å Ti.
Figure 9 Friction coefficient of the TiC coating (2 μm) with
and without interlayers as a function of sliding distance.
A, no interlayer; B, 500 Å Ti; C, 5000 Å Ti; D, 500 Å Cr; E,
5000 Å Cr. (Load = 250 g and speed = 10.21 cm/s).


Figure 10 Optical micrographs of the 2 μm TiC coating with different
interlayers after a wear time of 9.8 hours: (a) 500 Å Cr,
(b) 5000 Å Cr, (c) 500 Å Ti, (d) 5000 Å Ti, (e) no interlayer.
(Load = 250 g, speed = 10.21 cm/s, and alumina pin).


a Ti or Cr interlayer was inserted between the TiC films and the stainless steel substrates, although the
effects on the tribological properties differed for each kind of interlayer. The
microstructure of the coatings was characterized by scanning electron microscopy (SEM).
Figure 11 shows cross sections of the 2 μm TiC coatings with
500 Å Cr, 1000 Å Cr, 5000 Å Cr, 500 Å Ti, 1000 Å Ti, and 5000 Å Ti interlayers, respectively. The
images were magnified 13,000 times. The interfaces between the substrate and the Cr or Ti interlayer,
and between the interlayer and the TiC film, can be seen clearly in the figure. TiC is known to be
harder than the substrate and the Cr or Ti interlayer. Inserting a layer of soft Ti or Cr between the
TiC film and the substrate should help dissipate the stress due to the applied load, which could
enhance the adhesion of the TiC coatings. Compact interfaces between coating, interlayer, and
substrate were seen in all samples except the one with the 5000 Å Ti interlayer. The interface between the TiC
coating and the 5000 Å Ti interlayer is not as good as that between the Ti interlayer and the
substrate. This interface quality should directly affect the adhesion of the coatings. The
friction and wear results discussed earlier reflect this and are consistent with the interface
structural analysis.

Table 1 lists the volume loss of alumina and stainless steel pins sliding over 2.0 μm TiC
coatings with and without interlayers. The rate of volume loss for the worn SS spherical
pin is at least twice that for the alumina pin. In contrast, the results also show
that the TiC coating is worn off more easily by the alumina pin than by the SS pin. There was no
apparent difference among the TiC coatings with different interlayers and without an interlayer when
the alumina pin was used in the friction experiments. The volume loss of the SS spherical pin
appears to increase with increasing applied load; in particular, there was a large difference in
rate between 500 g and 50 g.

2.2. 3 μm TiC coating

Increasing the thickness of the TiC coating should also influence the friction and
wear of the coating. Figure 12 shows the friction coefficient of 3 μm TiC coatings without an
interlayer and with 1000 Å Cr, 1000 Å Ti, and 1000 Å Mo interlayers, respectively, as a function
of the number of sliding passes. The friction for the coatings with Cr and Ti interlayers, as well as
Figure 11 SEM images of the cross section of 2 μm TiC coating with
different interlayers: (a) 500 Å Cr, (b) 5000 Å Cr, (c) 500 Å Ti,
(d) 5000 Å Ti.
Table 1 Volume loss of worn spherical pins coupled with TiC coatings:
(a) stainless steel pin, (b) alumina pin

(a)
Load (gram)                  500          250          100          50
Worn rate of pin (mm³/h)     2.23x10-"    8.00x10-*    7.99x10-*    3.67x10*

(b)
Coating components           Worn rate (mm³/h)
2 μm TiC + 5000 Å Ti         3.00x10*
2 μm TiC + 500 Å Ti          2.97x10-*
2 μm TiC + 5000 Å Cr         2.99x10*
2 μm TiC + 500 Å Cr          2.98x10*
2 μm TiC                     3.02x10*
Figure 12 Friction coefficient of the TiC coatings (3 μm) with
and without interlayer as a function of number of passes
under a load of 250 g and a speed of 10.21 cm/s. A, no
interlayer; B, 1000 Å Mo; C, 1000 Å Ti; D, 1000 Å Cr.
without an interlayer, was more stable and smoother than that for the coating containing the Mo
interlayer during 23400 cycles of wear. This is probably because of the very poor adhesion of the
coating containing the Mo interlayer, so that the TiC coating could not adhere strongly to the
substrate. Scratches occurred easily on the TiC coating surface, resulting in increased friction
(Figure 12(b)). The optical micrographs in Figure 13 indicate that the damage on the surface of the
TiC coating with the Cr interlayer was the smallest in all of the wear tests, showing only the onset
of wearing off. In contrast, the TiC coating containing Mo was greatly delaminated. The TiC coating
with the Ti interlayer behaved nearly the same as that without an interlayer in friction and wear,
but the former had a smoother friction curve than the latter. When the sliding distance was extended
to over 3500 meters, Figure 14 further shows that the TiC coating with a Cr (or Ti) interlayer has a
better wear life than the TiC coating without an interlayer. The coefficient of friction of the latter
varied over a large range and increased rapidly in the later stage of the wear test. The optical
micrographs give direct evidence for this observation (Figure 15).

Considering the influence of the thickness of TiC coatings on adhesion and wear, we found
that the 3 μm thick TiC coating performed much better than the 2 μm thick TiC coating.

2.3. 0.2 μm TiC coating

The friction and wear characteristics of the coatings should be affected by applied load,
especially when the TiC coating is very thin. Figures 16 and 17 show the friction as a function
of sliding time for thin TiC coatings (0.2 μm in thickness) containing 200 Å Ti and 200 Å
Cr interlayers, respectively, under different loads. The coefficient of friction of the TiC coating
with the Ti interlayer in Figure 16 was stable during 3 hours of sliding when the load was below
50 g. With increasing load, the friction and wear of the TiC coating gradually became unstable,
especially when the load exceeded 100 g, due to severe delamination of the coating. Similarly,
the friction and wear behavior of the TiC coating with the Cr interlayer in Figure 17 was
better under a load of 50 g than under a load of 100 g.

Summary and Conclusions

The  friction,  adhesion  and  wear  behaviors  have  been  investigated  for  the  TiC  coatings 
Figure 13 Optical micrographs of the worn traces of TiC coating with
different interlayers: (a) no interlayer, (b) 1000 Å Mo, (c) 1000 Å Ti,
(d) 1000 Å Cr. (Load = 250 g, time = 3 hours, and alumina pin).
Figure 14 Variation of the friction coefficient of TiC
coatings (3 μm) with and without interlayer with sliding
distance (m). A, no interlayer; B, 1000 Å Ti; C, 1000 Å Cr.
(Load = 250 g; sliding speed = 10.21 cm/s).
Figure 15 Worn traces of the 3 μm TiC coating with different
interlayers after a wear time of 9 hours: (a) no interlayer,
(b) 1000 Å Ti, (c) 1000 Å Cr. (Load = 250 g, speed = 10.21 cm/s,
and alumina pin).
Figure 16 Friction coefficient of the TiC coating (0.2 μm) with
200 Å Ti interlayer as a function of wear time (min) under
different loads: A, 25 g; B, 50 g; C, 100 g; D, 150 g.
(Speed = 10.21 cm/s).
Figure 17 Friction coefficient of the TiC coating (0.2 μm)
with 200 Å Cr interlayer as a function of wear time (min) under
different loads: A, 50 g; B, 100 g. (Speed = 10.21 cm/s).


containing different kinds of interlayers (Ti, Cr and Mo) of different thicknesses using a pin-on-disk
configuration, in which both stainless steel and alumina pins were used. The wear tracks of the
TiC coatings and the pins were characterized by both optical microscopy and scanning electron
microscopy (SEM).

1) The equilibrium coefficients of friction were found to vary between 0.19 and 0.22 for the TiC
coatings when stainless steel and alumina pins were used alternately in our experiments. There
was significant stick-slip during the wear tests for the TiC/alumina couple. Optical
microscopy showed that the TiC coating was scratched and apparently worn through
by alumina pins, but only slightly worn by stainless steel pins. The scars on the alumina pins were much
less severe than the scars on the SS pins during the wear test. The rate of volume loss of SS pins
was more than twice that of alumina pins.

2) Inserting a Cr or Ti interlayer between the 2 μm TiC coating and the substrate can greatly improve
adhesion and wear resistance. The TiC coatings containing 500 Å Cr, 5000 Å Cr, 500 Å
Ti and 5000 Å Ti interlayers showed more stable and smoother friction than the coating without an
interlayer during 3 hours of sliding. Microscopic analyses further indicated that the coatings with 500 Å
Cr, 500 Å Ti, 5000 Å Cr and 5000 Å Ti interlayers had stronger adhesion and better wear resistance than the
coating without an interlayer. The TiC coating with the 500 Å Cr interlayer performed better than that
with the 5000 Å Cr interlayer in friction and wear tests. The same was found to be true for the Ti
interlayer, i.e., thinner interlayers performed better in the adhesion and wear tests. This is
probably due to the improved lattice match between the TiC coatings and the interlayers when
the interlayer thickness was reduced.

3) When the friction test was extended to about 10 hours, that is, a sliding distance of
over 3500 meters, the TiC coatings with 500 Å Cr, 500 Å Ti and 5000 Å Cr interlayers maintained
good friction and wear behavior, but the TiC coatings with the 5000 Å Ti interlayer or without an
interlayer did not. The wear surfaces of the latter were damaged much more seriously than those of the former.

4) The wear resistance of the TiC coatings could be increased by increasing their thickness from
0.2 μm to 2 μm, and to 3 μm. This was closely related to the load bearing capability of coatings
of different thickness. The friction and wear behavior of the thin TiC coating was easily affected
by applied load. For loads of 150 g or above, the 0.2 μm TiC coating with a 0.02 μm Cr or Ti
interlayer could be destroyed by alumina pins in a short sliding time.

5) During 3 hours (about 23400 cycles) of sliding tests of 3 μm thick TiC coatings with 1000 Å
Cr, 1000 Å Ti and 1000 Å Mo interlayers, the TiC coatings with Cr or Ti interlayers
showed much more stable and smoother friction and wear behavior than that containing the Mo
interlayer. This is because of the poor stress bearing capability of the porous Mo film deposited
at low temperature. The degree of delamination on the surface of the TiC coating with the Cr interlayer
was the smallest of all the samples.




References 

1. J. C. Angus and C. C. Hayman, Science 241, 913 (1988).

2. L. Kempfer, Mater. Eng. 108, 28 (1991).

3. G. Georgiev, N. Feschiev, D. Popov and Z. Uzuuov, Vacuum 36, 595 (1986).

4. J. F. Sundgren, B. O. Johansson and S. E. Karlsson, Thin Solid Films 105, 353 (1983).

5. O. Rist and P. T. Murray, Mater. Lett. 10, 322 (1991).

6. M. S. Donley, J. S. Zabinski, W. J. Sessler, V. F. Dyhouse, S. D. Walck and N. T. McDevit,
Mater. Res. Soc. Symp. Proc. 236, 461 (1991).

7. W. J. Sessler, M. S. Donley, J. S. Zabinski, S. D. Walck and V. J. Dyhouse, Surface and
Coatings Technology 56, 125-130 (1993).

8. Tinke Tang, Jeffrey S. Zabinski and J. E. Bultman, Surface and Coatings Technology (1997),
in press.




DEVELOPMENT  OF  MASSIVELY  PARALLEL  EPIC  HYDROCODE 
IN  CRAY  T3D  USING  PVM 


C.T.  Tsai 

Associate  Professor 

Department  of  Mechanical  Engineering 


Florida  Atlantic  University 
777  Glades  Road 
Boca  Raton,  FL  33431 


Final  Report  for: 

Summer  Faculty  Research  Program 
Wright  Laboratory 


Sponsored  by: 

Air  Force  Office  of  Scientific  Research 
Bolling  Air  Force  Base,  DC 

and 

Wright  Laboratory 


December  1996 





Abstract 


The objective of this report is to verify the feasibility, in terms of computational speed, of
converting a large sequential EPIC hydrocode into a massively parallel EPIC hydrocode. Sequential
subroutines in the Research EPIC hydrocode, a Lagrangian finite element analysis code for high
velocity elastic-plastic impact problems, are individually converted into parallel code using Cray
Adaptive Fortran (CRAFT). The massively parallel subroutines running on 32
PEs of the Cray-T3D are faster than their sequential counterparts on the Cray-YMP. In the next stage of the
research, Parallel Virtual Machine (PVM) directives are used to develop a PVM version of the
EPIC hydrocode by connecting the converted parallel subroutines running on multiple PEs of the
T3D to the sequential part of the code running on a single PE. As more massively parallel
subroutines were incorporated into the PVM EPIC hydrocode, the speedup of the code
increased accordingly. The results indicate that significant speedup can be
achieved in the EPIC hydrocode when most or all of the subroutines are massively parallelized.

1. INTRODUCTION

The aim of research and development of weapon systems is to develop systems that can
respond quickly to the combat needs of operational commanders. Presently, warhead
design involves expensive and time-consuming tests on prototypes. Computer
simulation of warhead penetration can save large amounts of money and time in warhead
design. The simulation of complex penetration behavior requires a large amount of CPU
time, even when run on a fast vector processor machine like the Cray-YMP. The EPIC
hydrocode is used at the Armament Directorate of the Air Force Wright Laboratory to solve
high velocity elastic-plastic impact problems involved in warhead design.

The primary objective of the thesis was to verify the feasibility of converting a large
finite element analysis code into a massively parallel finite element code in terms of
computational speed and cost. The Research EPIC hydrocode, a Lagrangian finite
element analysis code, was selected for parallelization. Its computationally intensive
algorithm, large data set, and liberal data distribution policy motivated
parallelizing the EPIC code. The Cray-T3D was chosen as the MPP platform because it
supports the MIMD/SPMD model, the CRAFT programming model, and the PVM message passing
programming model. More importantly, it is closely coupled with other Cray PVP
systems like the Cray-YMP, on which the EPIC code was already developed. Related goals
included developing an incremental approach to parallelization using the PVM message
passing paradigm and identifying the CRAFT parallelization techniques best suited to the
EPIC subroutines.




The next section gives a brief introduction to parallel processing and its various
applications. Chapter 2 describes the EPIC theory. The Cray-T3D architecture and MPP
programming model are discussed in Chapter 3. Chapter 4 presents the performance results of
the individual parallel subroutines. Chapter 5 describes the incremental approach and the
porting of EPIC subroutines using PVM. Chapter 6 covers the implementation of the EPIC
hydrocode using heterogeneous PVM. Chapter 7 includes conclusions and future work.

1.1  INTRODUCTION  TO  PARALLEL  PROCESSING 

The ever-increasing computational needs of emerging applications have been the
primary motivating factor for the steady increase in speed of traditional serial
computers. However, the fundamental physical limit imposed by the speed of light
makes it impossible to keep improving the speed of serial, single-processor
computers indefinitely. Recent trends show that the performance of these
computers is beginning to saturate. (It is often remarked that the speed of basic
microprocessors grows by a factor of 2 every 18 months; this empirical observation, true
over many years, is called Moore's law.) A natural way to circumvent this saturation is to
use an ensemble of processors to solve problems [1].

Parallel processing, the method of having many small tasks solve one large problem,
has emerged as a key enabling technology in modern computing. The past several years
have witnessed an increasing acceptance and adoption of parallel processing, both for
high performance scientific computing and for more general purpose applications, as a
result of the demand for higher performance, lower cost, and sustained productivity. The
acceptance of parallel processing has facilitated two major developments: massively
parallel processors (MPPs) and the use of distributed computing.

A massively parallel processing (MPP) machine combines a few hundred to a few
thousand CPUs in a single large cabinet connected to hundreds of Gbytes of memory.
MPPs offer enormous computational power and are considered to be among the most
powerful computers in the world. MPPs are used to solve computational Grand Challenge
problems such as global climate modeling and determining molecular, atomic, and nuclear
structures. As simulations become more realistic, the computational power required to
produce them grows rapidly, and that is where MPPs come into the picture.

The  second  major  development  affecting  scientific  problem  solving  is  distributed 
computing.  Distributed  computing  is  a  process  whereby  a  set  of  computers  connected  by 
a  network  are  used  collectively  to  solve  a  single  large  problem.  As  more  and  more 
organizations  have  high  speed  local  area  networks  interconnecting  many  general  purpose 
workstations,  the  combined  computational  resources  may  exceed  the  power  of  a  single 
high  performance  computer.  In  some  cases,  several  MPPs  have  been  combined  using 
distributed  computing  to  produce  unequaled  computational  power  [2]. 

1.2 MOTIVATION AND APPLICATIONS FOR PARALLEL COMPUTING

The  traditional  scientific  paradigm  is  first  to  do  theory,  and  then  lab  experiments  to 
confirm  or  deny  the  theory.  The  traditional  engineering  paradigm  is  first  to  do  a  design 
and  then  build  a  laboratory  prototype.  Both  paradigms  are  being  replaced  by  numerical 
experiments  and  numerical  prototyping  for  the  following  reasons:  real  phenomena  are  too 
complicated  to  model  on  paper  (e.g.  climate  prediction)  and  real  experiments  are  too  hard 
and  too  expensive  for  a  laboratory  (e.g.  oil  reservoir  simulation,  large  wind  tunnel, 
overall  aircraft  design  etc.). 

Scientific and engineering problems requiring the most computing power to simulate
are commonly called "Grand Challenges". Such problems, like predicting the climate a few years
hence, are estimated to require computers computing at the rate of 1 Tflops (i.e., 10^12 floating
point operations per second) and a memory size of 1 TB (terabyte). One "Grand
Challenge" problem, climate modeling, is illustrated here. In a simplified climate model,
climate is defined as a function of 4 arguments: longitude, latitude, elevation, and time. This
function returns a vector of 6 values: temperature, pressure, humidity, and wind velocity
(3 components). To represent the continuous function in the computer, the domain is
discretized and climate is evaluated for arguments lying on a grid: climate(i, j, k, n),
where t = n*dt, dt is a fixed time step, n is an integer, and i, j, k are integers indexing
the longitude, latitude, and elevation grid cells, respectively.

An algorithm to predict the weather (short term) or climate (long term) is a function
that maps the climate at time t, climate(i, j, k, n), for all i, j, k, to the climate at
the next time step t+dt, climate(i, j, k, n+1), for all i, j, k. The algorithm involves a
system of equations including, in particular, the Navier-Stokes equations for the fluid flow
of gases in the atmosphere. Suppose the earth's surface is discretized into 1 kilometer by 1
kilometer cells in the latitude-longitude directions, with 10 cells in the vertical direction.
From the surface area of the earth, we can compute that there are about 5*10^9 cells in the
atmosphere. With six 4-byte words per cell, the memory requirement is about 0.1 TB.
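
The idea of an algorithm as a map from the state at step n to the state at step n+1 can be illustrated with a toy time-stepping loop. The Python sketch below is only an assumption-laden illustration: it uses a one-dimensional periodic grid, a single variable, and a placeholder neighbor-averaging rule in place of the Navier-Stokes system, and the names `step` and `advance` are hypothetical.

```python
# Toy time stepping: state at step n -> state at step n+1.
# The update rule (neighbor averaging) is a placeholder, NOT the
# system of equations used in a real climate model.

def step(field):
    """Return the field at step n+1 from the field at step n (1-D, periodic)."""
    size = len(field)
    return [
        (field[(i - 1) % size] + field[i] + field[(i + 1) % size]) / 3.0
        for i in range(size)
    ]


def advance(field, n_steps):
    """Apply step() n_steps times, i.e. advance the time by n_steps * dt."""
    for _ in range(n_steps):
        field = step(field)
    return field


temperature = [0.0, 0.0, 30.0, 0.0, 0.0, 0.0]  # hypothetical values on 6 cells
later = advance(temperature, 10)
```

Each call to `step` touches every cell once, which is why the cost of one time step scales with the number of cells.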

Assume that it takes 100 flops to update each cell by one minute. In other words, if
dt = 1 minute, computing climate(i, j, k, n+1) for all i, j, k from climate(i, j, k, n)
takes about 100*5*10^9 or 5*10^11 floating point operations. Simply keeping pace with real
time (one simulated minute per 60 seconds of computing) therefore requires a computing speed
of at least 8 Gflops. Weather prediction (computing for 24 hours to predict the weather 7 days
hence) requires computing 7 times faster, or about 56 Gflops. Climate prediction (computing
for 30 days to predict the climate 50 years hence) requires computing 50*12 =
600 times faster, or a 4.8 Tflops machine.
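
The estimate above can be reproduced in a few lines. The Python sketch below uses only the figures given in the text (5*10^9 cells, six 4-byte words per cell, 100 flops per cell update, a 1-minute time step); the variable and function names are illustrative. Speedup factors are taken as the ratio of simulated time to wall-clock time.

```python
# Back-of-the-envelope memory and speed estimate for the simplified climate model.
CELLS = 5e9                  # grid cells, as estimated in the text
WORDS_PER_CELL = 6           # values stored per cell
BYTES_PER_WORD = 4
FLOPS_PER_CELL_UPDATE = 100  # assumed cost of advancing one cell by one step
DT_SECONDS = 60.0            # time step of 1 minute

memory_tb = CELLS * WORDS_PER_CELL * BYTES_PER_WORD / 1e12
flops_per_step = CELLS * FLOPS_PER_CELL_UPDATE
real_time_gflops = flops_per_step / DT_SECONDS / 1e9


def required_gflops(speedup):
    """Rate needed to simulate `speedup` times faster than real time."""
    return real_time_gflops * speedup


for label, speedup in (("real time", 1), ("7x real time", 7), ("600x real time", 600)):
    print(f"{label}: {required_gflops(speedup):.1f} Gflops")
print(f"memory: {memory_tb:.2f} TB")
```

The unrounded real-time rate is slightly above 8 Gflops, so the 600-fold case comes out near 5 Tflops; the 4.8 Tflops figure in the text follows from rounding to 8 Gflops first.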

The actual grid resolution used in climate modeling today is about 4 degrees of latitude
by 5 degrees of longitude (about 460 km by 560 km), a rather coarse resolution. A near-term
goal is to decrease this grid size using parallel computing techniques, such that
different grid data are stored on different processors and the relevant simultaneous equations
are solved in parallel on different processors [3].
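
One common way to store different grid data on different processors is a block decomposition, in which each processor owns a contiguous range of grid indices. The Python sketch below shows that bookkeeping under stated assumptions: the helper name `block_ranges` and the choice of 8 processors are hypothetical, and the 45 latitude bands simply correspond to a 4-degree resolution over 180 degrees.

```python
def block_ranges(n_cells, n_procs):
    """Split indices 0..n_cells-1 into n_procs contiguous, near-equal blocks.

    Returns a list of (start, stop) pairs, one per processor rank, with
    stop exclusive. The first (n_cells % n_procs) ranks get one extra cell.
    """
    base, extra = divmod(n_cells, n_procs)
    ranges = []
    start = 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


# 45 latitude bands distributed over 8 hypothetical processors
bands = block_ranges(45, 8)
for rank, (start, stop) in enumerate(bands):
    print(f"processor {rank}: latitude bands {start}..{stop - 1}")
```

Keeping each block contiguous means a processor only needs to exchange boundary values with its neighbors at each time step, rather than with every other processor.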

Selected  application  areas  in  the  field  of  parallel  computing  are  as  follows: 

• Weather and Climate

  o Prediction of weather, climate and global change.

• Biology

  o Determination of molecular, atomic and nuclear structure.
  o Mapping the human genome and understanding the structure of biological macromolecules.

• Mechanical Engineering

  o Finite element methods.
  o Particle methods in aerospace.
  o Understanding turbulence, pollution dispersion and combustion systems.
  o Computational fluid dynamics.

• Chemical Engineering

  o Gas hydrate simulation.
  o Simulation of fluids in pores.

• Environmental modeling

  o Modeling whole ecosystems.
  o Assessment of pollution remediation.

• Material science

  o Understanding the nature of new materials.
  o Exploring theories of matter.

• Demography

  o Interactive access to large databases.

1.3 CLASSIFICATION OF PARALLEL COMPUTERS

Parallel  computers  are  classified  based  on  various  dimensions  like  control 
mechanism,  address-space  organization,  interconnection  network  and  granularity  of 
processors. 

1.3.1 BASED ON CONTROL MECHANISM

Processing units in parallel computers either operate under the centralized control of a
single control unit or work independently. On this basis, parallel computers are classified
as single instruction stream, multiple data stream (SIMD) or multiple instruction stream,
multiple data stream (MIMD). In SIMD architectures, a single control unit dispatches
instructions to each processing unit, and the same instruction is executed
synchronously by all processing units. Examples of SIMD parallel computers are the MasPar
MP-1, MasPar MP-2, MPP, DAP and CM-2. Computers in which each processor is
capable of executing a different program independently of the other processors are called
MIMD computers. Examples of MIMD computers are the Cosmic Cube, nCUBE-2, iPSC,
CM-5, Paragon XP/S and Cray-T3D.


SIMD computers require less hardware than MIMD computers because they have only one
global control unit. They also require less memory because only one copy of the program
needs to be stored. In contrast, MIMD computers store the program and operating system at
each processor. SIMD computers are naturally suited to data parallel programming. Although
the individual processors in an MIMD computer are more complex, general purpose
microprocessors may be used. Hence, owing to economies of scale, processors in MIMD
computers are both cheaper and more powerful than processors in SIMD computers.


Figure  1-1:  (a)  Typical  SIMD  architecture  and  (b)  Typical  MIMD  architecture 


1.3.2 BASED ON ADDRESS-SPACE ORGANIZATION

Solving a problem on an ensemble of processors requires interaction among
processors. Message passing and shared address space architectures provide two different
means of processor interaction. In a message passing architecture, processors are
connected using a message passing interconnection network. Each processor has its own
memory, called local memory, which is accessible only to that processor. Processors can
interact only by passing messages. This is also referred to as a distributed memory
architecture. Examples include the nCUBE-2, CM-5, Cosmic Cube and Paragon XP/S. The
shared address space architecture provides hardware support for read and write access
by all processors to a shared address space. Most shared address space computers
contain a shared memory that is equally accessible to all processors through an
interconnection network. These architectures are called shared memory parallel
computers. Examples include the C.mmp and the NYU Ultracomputer. A drawback of these
architectures is that the bandwidth of the interconnection network must be substantial to
ensure good performance. The Cray-T3D has a logically shared, physically distributed
memory.


P:  Processor 
M:  Memory 


Figure  1-2:  Message  Passing  architecture 


Figure  1-3:  Shared-address  space  architectures 




1.3.3  BASED  ON  INTERCONNECTION  NETWORKS 


Shared address space computers and message passing computers can be constructed
by connecting processors and memory units using a variety of interconnection networks.
Interconnection networks are classified as static or dynamic. Static networks consist of
point-to-point communication links among processors and are also referred to as direct
networks. Static networks are typically used to construct message passing computers.
Dynamic networks are built using switches and communication links. The communication
links are connected to one another dynamically by the switching elements to establish
paths among processors and memory banks. Dynamic networks are also referred to as
indirect networks and are normally used to construct shared address space computers.

1.3.4 BASED ON GRANULARITY OF PROCESSORS

A parallel computer may be composed of a small number of very powerful processors
or a large number of relatively less powerful processors. Computers belonging to the
former class are called coarse-grain computers, and those belonging to the latter are called
fine-grain computers. Examples of coarse-grain computers are the Cray-YMP and Cray-C90,
which offer a small number of processors, each capable of several Gflops. In contrast,
fine-grain computers such as the CM-2, MasPar MP-1 and MasPar MP-2 offer a
large number of relatively slow processors. The MasPar MP-1 contains up to 16,384 four-bit
processors. Between these extremes lies a class of parallel computers called medium-grain
computers, which includes the CM-5, nCUBE-2 and Paragon XP/S.

The granularity of a parallel computer can also be defined as the ratio of the time required
for a basic communication operation to the time required for a basic computation. Parallel
computers for which this ratio is small are suitable for algorithms requiring frequent
communication, that is, algorithms in which the grain size of the computation is small.
Since such algorithms contain fine-grain parallelism, these parallel computers are often
called fine-grain parallel computers. In contrast, parallel computers for which this ratio
is large are suited to algorithms that do not require frequent communication. These
computers are referred to as coarse-grain computers. According to this criterion,
multicomputers such as the nCUBE-2 and Paragon XP/S are coarse-grain computers, whereas
multiprocessors such as the C.mmp, TC-2000 and KSR-1 are fine-grain parallel
computers. The Cray-T3D is a moderately coarse-grain parallel computer.

2. EPIC RESEARCH HYDRO CODE

2.1  EPIC  THEORY 

There are many different computer codes available to calculate dynamic response to
impact. For impact problems involving elastic-plastic flow with large displacements, the
solutions have most often been obtained with Lagrangian codes. The EPIC code is also a
Lagrangian FEA code, developed by Dr. G.R. Johnson et al. of Alliant Techsystems Inc. It
takes advantage of the fact that a triangular or tetrahedral element formulation is better
suited to representing severe distortions than the traditional quadrilateral or hexahedral
finite difference methods [4,5].

The  finite  element  method  is  implemented  in  the  following  steps: 

• The geometry of the problem is represented with elements of triangular cross-section
having specific material characteristics.

Figure 2-1: Geometric properties of triangular elements

• The distributed mass is lumped at the nodes, and the initial velocities representing the
motion at impact are assigned.

The lumped mass at each of the three nodes is:

$$M_i = M_j = M_m = \tfrac{1}{3} V_0 \rho_0$$

where $V_0$ and $\rho_0$ are the initial volume and density of the element.

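• The numerical integration loop works as follows:

o The strains and strain rates in the elements are determined.

$$\bar{\varepsilon} = \sqrt{\tfrac{2}{9}\left[(\varepsilon_r - \varepsilon_z)^2 + (\varepsilon_r - \varepsilon_\theta)^2 + (\varepsilon_z - \varepsilon_\theta)^2 + \tfrac{3}{2}\gamma_{rz}^2\right]}$$

where $\bar{\varepsilon}$ is the equivalent strain, and $\varepsilon_r$, $\varepsilon_\theta$, $\varepsilon_z$ and $\gamma_{rz}$ are the strains relative to the system axes.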

o Stresses in the element are determined. The stresses consist of elastic stresses, plastic
deviator stresses, hydrostatic pressure, and artificial viscosity.

Elastic stresses are obtained from Hooke's law.

$$\sigma_r = \lambda\varepsilon_v + 2G\varepsilon_r - Q$$

$$\sigma_z = \lambda\varepsilon_v + 2G\varepsilon_z - Q$$

$$\sigma_\theta = \lambda\varepsilon_v + 2G\varepsilon_\theta - Q$$

$$\tau_{rz} = G\gamma_{rz}$$

where $\sigma_r$, $\sigma_z$, $\sigma_\theta$ and $\tau_{rz}$ are the radial, axial, circumferential, and shear stresses,
$\varepsilon_v$ is the volumetric strain, and $\lambda$ and $G$ are Lame's elastic constants.

These stresses are combined to form the equivalent stress $\bar{\sigma}$.

$$\bar{\sigma} = \sqrt{\tfrac{1}{2}\left[(\sigma_r - \sigma_z)^2 + (\sigma_r - \sigma_\theta)^2 + (\sigma_z - \sigma_\theta)^2 + 6\tau_{rz}^2\right]}$$

This stress represents the overall state of stress within the element.

Plastic flow begins when the elastic strength of the material is exceeded. When this
occurs, the normal stresses are obtained by combining the hydrostatic pressure with
the plastic deviator stresses and the artificial viscosity.

$$\sigma_r = s_r - (P + Q)$$

$$\sigma_z = s_z - (P + Q)$$

$$\sigma_\theta = s_\theta - (P + Q)$$

where $s_r$, $s_z$ and $s_\theta$ are the plastic deviator stresses, $P$ is the hydrostatic pressure and $Q$
is the artificial viscosity. The plastic deviator stresses represent the shear strength
characteristics of the material.

The hydrostatic pressure is determined from the Mie-Gruneisen equation of state.

$$P = (K_1\mu + K_2\mu^2 + K_3\mu^3)\left(1 - \frac{\Gamma\mu}{2}\right) + \Gamma\rho E$$

where

$$\mu = \frac{\rho}{\rho_0} - 1 = \frac{V_0}{V} - 1$$

The specific internal energy, $E$, is obtained from the work done on the element by
the various stresses; $K_1$, $K_2$ and $K_3$ are material-dependent constants, and $\Gamma$ is the
Gruneisen coefficient.

The artificial viscosity is combined with the normal stresses to damp out localized
oscillations of the concentrated masses. It is applied only when the volumetric strain
rate is negative.

$$Q = C_L\sqrt{(\lambda + 2G)\rho A}\;|\dot{\varepsilon}_v| + C_0^2\,\rho A\,\dot{\varepsilon}_v^2 \quad \text{for } \dot{\varepsilon}_v < 0$$

$$Q = 0 \quad \text{for } \dot{\varepsilon}_v \ge 0$$

where the recommended values of the coefficients are $C_L = 0.5$ and $C_0^2 = 4.0$.

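o Equivalent concentrated forces that act on the nodal masses are determined. The radial
and axial forces, $F_r^i$ and $F_z^i$, acting on node i of an element are:

$$F_r^i = -\pi\bar{r}\left[(z_j - z_m)\sigma_r + (r_m - r_j)\tau_{rz}\right] - \tfrac{2}{3}\pi A\sigma_\theta$$

$$F_z^i = -\pi\bar{r}\left[(r_m - r_j)\sigma_z + (z_j - z_m)\tau_{rz}\right]$$

where $\bar{r}$ is the average radius of the element, $A$ is its cross-sectional area, and the nodal
coordinates represent the displaced geometry.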

o The integration time increment is determined.

Maintaining a numerically stable solution for dynamics problems is generally
accomplished by using a numerical integration time increment that is sufficiently
smaller than the lowest period of vibration of the system.

o The equations of motion are applied to the nodes for the integration time increment. The
equations of motion can be numerically integrated by assuming a constant
acceleration over each time increment. The radial acceleration of node i at time t is:

$$\ddot{u}_t = \frac{\sum F_r^i}{\sum M_i}$$

where the sums run over the elements that share node i. The new displacement at time t + Δt is

$$u_{t+\Delta t} = u_t + \dot{u}_t\,\Delta t + \tfrac{1}{2}\,\ddot{u}_t\,(\Delta t)^2$$

and the new velocity is

$$\dot{u}_{t+\Delta t} = \dot{u}_t + \ddot{u}_t\,\Delta t$$

The equations of motion for the axial direction have a similar form. After the
equations of motion are numerically integrated, an integration cycle is complete.

• The numerical integration loop is repeated until the time of interest has elapsed.
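The main quantities evaluated in one integration cycle can be sketched as a handful of small functions. This is an illustrative Python sketch, not the EPIC source: the function and argument names are hypothetical, the formulas are the standard forms of the equivalent strain, equivalent (von Mises) stress, Mie-Gruneisen pressure, and linear-plus-quadratic artificial viscosity assumed for this section, and the energy term assumes E is the internal energy per unit mass.

```python
import math


def equivalent_strain(e_r, e_z, e_t, g_rz):
    """Equivalent strain from radial, axial, hoop and shear strain components."""
    return math.sqrt((2.0 / 9.0) * ((e_r - e_z) ** 2 + (e_r - e_t) ** 2
                                    + (e_z - e_t) ** 2 + 1.5 * g_rz ** 2))


def equivalent_stress(s_r, s_z, s_t, t_rz):
    """Equivalent (von Mises) stress from the axisymmetric stress components."""
    return math.sqrt(0.5 * ((s_r - s_z) ** 2 + (s_r - s_t) ** 2
                            + (s_z - s_t) ** 2 + 6.0 * t_rz ** 2))


def mie_gruneisen_pressure(rho, rho0, energy, k1, k2, k3, gamma):
    """Hydrostatic pressure; `energy` is assumed to be per unit mass."""
    mu = rho / rho0 - 1.0
    cubic = k1 * mu + k2 * mu ** 2 + k3 * mu ** 3
    return cubic * (1.0 - gamma * mu / 2.0) + gamma * rho * energy


def artificial_viscosity(ev_rate, rho, area, lam, g, c_l=0.5, c0_sq=4.0):
    """Viscosity Q; applied only while the element is being compressed."""
    if ev_rate >= 0.0:
        return 0.0
    linear = c_l * math.sqrt((lam + 2.0 * g) * rho * area) * abs(ev_rate)
    quadratic = c0_sq * rho * area * ev_rate ** 2
    return linear + quadratic


def advance_node(u, v, force, mass, dt):
    """Constant-acceleration update of one nodal displacement and velocity."""
    a = force / mass
    u_new = u + v * dt + 0.5 * a * dt * dt
    v_new = v + a * dt
    return u_new, v_new
```

A driver would call these once per element and per node in every cycle and repeat until the time of interest is reached. The time increment `dt` has to be chosen by the stability criterion described above, which this sketch does not attempt.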
3. PARALLEL PROGRAMMING TECHNIQUES ON CRAY-T3D


3.1 BACKGROUND

The first step in parallel programming is the development of a parallel algorithm to
solve the problem. There are two different approaches to algorithm development. The first
applies when a sequential algorithm already exists and, with minor modifications,
can be turned into a parallel program using parallel programming paradigms. This
approach may not succeed when the sequential algorithm has too many
bottlenecks. In that case, a parallel algorithm has to be developed, which may take a
totally different approach from the sequential algorithm.

To run a parallel algorithm on a parallel computer, one needs to implement it in a
programming language. In addition to providing all the functionality of a sequential
language, a language for programming parallel computers must provide mechanisms for
sharing information among processors. It must do so in a way that is clear, concise, and
readily accessible to the programmer. Different parallel programming languages enforce
different programming paradigms. The variations among paradigms are motivated by
several factors. First, there is a difference in the amount of effort invested in writing
parallel programs. Some languages require more work from the programmer, while others
require less work but yield less efficient code. Second, one programming paradigm may
be more efficient than others for programming on certain parallel computer
architectures. Third, various applications have different types of parallelism, so different
programming languages have been developed to exploit them.

3.2 CRAY-T3D ARCHITECTURE OVERVIEW

The introduction of the Cray-T3D by CRI in late 1993 was a significant event in the
field of massively parallel supercomputing. The T3D promises a major advance in highly
parallel hardware through its low latency (1 microsecond) and high bandwidth (125
MB/sec) interconnect. The Cray-T3D is scalable to 2048 processor elements and 300
Gflops. It is a multiple instruction, multiple data (MIMD) architecture machine. It has a
logically shared and physically distributed DRAM. The Cray-T3D has a two-tier architecture
consisting of a macro architecture and a micro architecture.


Figure  3-1 :  Cray-T3D  two  tier  architecture 

3.2.1  MICRO  ARCHITECTURE 

The micro architecture will vary as technologies advance to achieve Tflops of
sustained performance. Micro architecture refers to the microprocessor chip used in the
Cray-T3D. For its first generation machines (T3D), CRI chose the DEC chip 21064 (Alpha
chip) for its performance, features, technology and availability. The Alpha chip consists of
four main components: IBOX, EBOX, FBOX and ABOX.

(a)  Central  control  unit  (IBOX):  The  IBOX  performs  instruction  fetch,  resource 
checks,  and  dual  instruction  issue  to  the  EBOX,  ABOX  and  FBOX  or  branch  unit.  It 
handles  pipeline  stalls,  aborts  and  restarts. 

(b)  Integer  execution  unit  (EBOX):  The  EBOX  contains  a  64-bit  fully  pipelined 
integer  execution  data  path  including  adders,  logic  box,  barrel  shifter,  byte  extract  and 
mask,  and  independent  integer  multiplier.  In  addition,  it  contains  a  32  entry  64-bit  integer 
register  file. 

(c) Floating point unit (FBOX): The FBOX contains a fully pipelined floating point
unit and independent divider, supporting both IEEE and VAX floating point data types.

(d) Load/Store or address unit (ABOX): The ABOX contains five major sections:
address translation data path, load silo, write buffer, data cache (DCACHE) interface and
external bus interface unit.

The Alpha chip uses a seven stage pipeline for integer operate and memory
reference instructions, and a six stage pipeline for floating point operate instructions. The
IBOX maintains state for all pipeline stages to track outstanding register writes.

The chip also contains two on-chip caches: a data cache (DCACHE) and an instruction cache
(ICACHE). The data cache contains 8 KB and is a write-through, direct-mapped,
read-allocate physical cache with 32-byte blocks. A direct-mapped cache has only one
possible location for a given cache line. "Read allocate" means that entries are placed in
the cache only as a result of a cacheable load from local memory. On a cache hit, data is
loaded into a register from the DCACHE; on a cache miss, one cache line is loaded from
DRAM.
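The behavior of a direct-mapped, read-allocate cache can be made concrete with a small model. The following Python sketch is illustrative only: the class and method names are hypothetical, it tracks tags rather than data, and it models loads only (stores, which are written through and do not allocate, are left out).

```python
class DirectMappedCache:
    """Tag-only model of an 8 KB direct-mapped cache with 32-byte lines."""

    def __init__(self, size_bytes=8 * 1024, line_bytes=32):
        self.line_bytes = line_bytes
        self.num_lines = size_bytes // line_bytes   # 256 lines for 8 KB / 32 B
        self.tags = [None] * self.num_lines
        self.hits = 0
        self.misses = 0

    def load(self, address):
        """Simulate a cacheable load; return True on a hit."""
        line_number = address // self.line_bytes
        index = line_number % self.num_lines    # the only slot this line can use
        tag = line_number // self.num_lines
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag                  # read-allocate: fill on a load miss
        self.misses += 1
        return False


# Sequential 64-bit loads: one miss brings in a line of four words,
# so the next three loads hit.
cache = DirectMappedCache()
for word in range(1024):
    cache.load(8 * word)
print(cache.misses, cache.hits)   # 256 768
```

Two addresses that are exactly 8 KB apart map to the same slot in this model, so alternating between them misses on every load. This is the kind of conflict that careful data placement is meant to avoid.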


The instruction cache is 8 KB and is a physical direct-mapped cache with 32-byte
blocks. The Alpha chip supports a secondary cache built from off-the-shelf static RAMs,
although it is not used in the T3D. The chip directly controls the RAMs using its
programmable secondary cache interface, allowing each implementation to make its own
secondary cache speed and configuration tradeoffs.

Figure 3-2: Chip block diagram

3.2.2. SINGLE PE OPTIMIZATION


Optimization of a code for a single PE on the Cray-T3D is more difficult than for
Cray PVP system processors such as the C90, for the following reasons: optimizations are
state dependent, data locality is always an issue, bandwidth is a limiting factor, not as
many functional units are pipelined, and the compilers and tools are not as mature as
those available on the C90. The following problems are associated with the Alpha chip
used in the Cray-T3D: (a) all memory operations stall upon a cache miss, (b) the slow
external bus makes the DRAM bandwidth sub-optimal, (c) there are no integer to floating
point or SQRT instructions, (d) divide and integer multiply are not pipelined. A division
operation produces one result every 64 clock periods and an integer multiply produces one
result every 20 clock periods.

Every DRAM request results in a cache line load of four 64-bit words: one for the
actual request and three other words that are mapped to the same cache line.
Aligning data on a cache line boundary (word 0 of a cache line) enhances
performance. Cache alignment can be requested with the compiler directive CDIR$
CACHE_ALIGN. Performance can also be enhanced by scalar replacement, that is, by holding
the value of a temporary scalar in a register to reduce the number of memory accesses.
Cache utilization can be improved by loop interchange so that the stride in the inner loop
is one; a large stride in the inner loop causes cache misses. The DRAM memory of the
Alpha chip is interleaved, and one should ensure page boundary alignment. A page hit occurs
when the current and previous references are to the same even or same odd page, or when
the current and previous references have different chip select (cs) bits. A page miss occurs
when the current and previous references are to different odd pages. A page hit takes 8
clock periods whereas a page miss takes 22 clock periods.
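The benefit of loop interchange can be illustrated by counting cache line fetches for the two loop orders over a Fortran-style (column-major) array. The Python sketch below is a deliberately crude model and its names are hypothetical: it assumes four words per line, as stated above, and that only the most recently fetched line is retained. That understates what the real 8 KB data cache can hold, but it isolates the effect of stride.

```python
def line_fetches(n_rows, n_cols, inner="row", words_per_line=4):
    """Count line fetches when sweeping a column-major n_rows x n_cols array.

    inner="row": the row index varies fastest (stride 1 in memory).
    inner="col": the column index varies fastest (stride n_rows in memory).
    """
    if inner == "row":
        order = ((i, j) for j in range(n_cols) for i in range(n_rows))
    else:
        order = ((i, j) for i in range(n_rows) for j in range(n_cols))

    fetches = 0
    last_line = None
    for i, j in order:
        offset = i + j * n_rows              # column-major word offset
        line = offset // words_per_line
        if line != last_line:                # a different line must be fetched
            fetches += 1
            last_line = line
    return fetches


stride_one = line_fetches(64, 64, inner="row")
stride_n = line_fetches(64, 64, inner="col")
print(stride_one, stride_n)   # 1024 4096
```

With stride one, each fetched line serves four consecutive references. With a stride of 64 words, every reference touches a different line, so the fetch count is four times higher in this model.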

3.2.3. MACRO ARCHITECTURE

The macro architecture of the Cray-T3D is based on a 3D-torus. It will remain the same
in all three phases of the MPP project; the macro architecture is kept stable from one
generation to the next in order to preserve the applications development investment of the
users. The 3D-torus is a three dimensional grid with periodic boundary conditions. It
was chosen for its scaling properties, low latency, high bisection
bandwidth and low contention. Each node in this topology consists of two PEs, a block
transfer engine, and support circuitry that includes Data Translation Buffers (DTB),
Message Queue Control (MQC), Data Prefetch Queue (DPQ), Atomic Swap Control
(ASC), Barrier Synchronization Registers (BSR), and PE control.


Figure  3-3:  Node  architecture 

Each computational node has two identical PEs, which function independently of
each other. Each node has support circuitry including, but not limited to, a network
interface, a network router and a block transfer engine. The network interface formats the
information and the network router deformats it before sending it to PE0 or PE1. The block
transfer engine (BTE) is asynchronous and is shared by the two PEs. It can move data
independently, without involving either the local PE or the remote PE. It also provides
gather-scatter functionality in addition to data pre-fetch with a constant stride. It can
transfer up to 64 K words and can be used to select the PE number and memory offset bits
using the virtual global memory address facility. Using the BTE requires making system
calls. It also involves performing local work first, double buffering the remote data
transfers and working on those buffers; however, the start-up time for the BTE is very high.

The 3D-torus network, a high-speed interconnection network connecting the nodes,
operates at 150 MHz, identical to the clock of the Alpha chip used in the node. This leads
to low latencies for communication between nodes. It uses "dimensional order routing"
for propagation of messages. The network channels are 16 bits wide and can send
simultaneously in both directions in all three dimensions (X, Y and Z). The bandwidth of
each channel is 300 Mbytes/s. The bisection bandwidth with 1024 PEs is 75 GBytes/s. For
node to node message passing, the minimum measured latency is 1.3 microseconds. The
network has hardware synchronization primitives, which lend themselves to fast
synchronization or barriers. The T3D network transmits system control information and user
data. The control packets vary in size from 6 to 16 bytes. The data packets range in size
from 16 bytes to 52 bytes. The amount of data in these packets is 8 or 32 bytes, with the
remainder being header and checksum information. The headers and checksums contribute to a
load factor which affects attainable data transfer rates [6].
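The channel bandwidth and the effect of the packet overhead follow from simple arithmetic, sketched below in Python for illustration. The names are hypothetical, and pairing the 8-byte payload with the 16-byte packet and the 32-byte payload with the 52-byte packet is an assumption made only to bound the load factor; the text does not state which payload travels in which packet size.

```python
CLOCK_HZ = 150e6        # network clock, as stated in the text
CHANNEL_BITS = 16       # channel width in bits


def channel_bandwidth():
    """Raw bytes per second over one channel in one direction."""
    return CLOCK_HZ * CHANNEL_BITS / 8


def payload_fraction(payload_bytes, packet_bytes):
    """Share of a data packet that carries user data."""
    return payload_bytes / packet_bytes


raw = channel_bandwidth()
small = raw * payload_fraction(8, 16)    # assumed pairing: 8 B of data in a 16 B packet
large = raw * payload_fraction(32, 52)   # assumed pairing: 32 B of data in a 52 B packet

print(f"raw: {raw / 1e6:.0f} MB/s")
print(f"8-byte payloads: {small / 1e6:.0f} MB/s, 32-byte payloads: {large / 1e6:.0f} MB/s")
```

Under these assumptions only about half to 60 percent of the raw 300 Mbytes/s is available for user data, before any other protocol or software overhead is counted.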


Figure  3-4:  Cray-T3D  topology 


3.3 MPP PROGRAMMING MODEL

The MPP programming model [7] for the Cray-T3D system supports several styles of
programming: data parallel, global address, work sharing and message passing. These
styles may be used individually or combined in the same program. This model allows the
user a range of control over the MPP hardware. The range extends from low-level
control, in which the programmer makes almost all of the decisions about how data and
work are partitioned and distributed, to high-level control, in which the programmer
identifies where the parallelism is located and lets the system determine how best to
exploit it. The important elements of this programming model are access and placement of
data, parallel execution, local execution, work sharing, synchronization primitives,
sequential I/O, subroutine interfaces and special intrinsic functions.

3.3.1. DATA PARALLEL MODEL

The MPP programming model divides data objects into two categories:

(1) private data (PE_PRIVATE), which are private to a task

(2) shared data (SHARED), which are shared among all the tasks.

A private data object resides on every PE, each PE holding its own copy, rather than one
copy being spread over all of them. It is not accessible to any other task. A task that
references a private object references its own private version of that object, so private
data objects associated with different PEs can have different values.

Shared data objects, on the other hand, are accessible to all tasks. They are not
replicated and, in the case of arrays, may be distributed across multiple PEs.

In data parallel programming, data such as scalars or arrays are distributed over the
memories of the PEs working on the program. In this programming model, the goal is to
let as many PEs as possible work on their own data (residing in their own memory) rather
than on data residing in another PE's memory.

In CRAFT, the data distribution is indicated by the compiler directives PE_PRIVATE
and SHARED. Data that are not explicitly declared to be shared are, by default, private.
Variables and arrays can be explicitly declared as private with the PE_PRIVATE
directive.

CDIR$ PE_PRIVATE var1, var2 ... varn

All private data objects may be DATA initialized except those that occur in blank
common, dummy arguments, automatic arrays, and those whose size is a function of
N$PES (the number of tasks).

The shared data objects are declared with the SHARED directive.

CDIR$ SHARED array1(dist1), array2(dist2) ... arrayn(distn)

The SHARED directive names the variables that are to be shared data objects and
specifies their distribution across the PEs. Distributions of shared data objects fall into
two categories: shared scalars and dimensional distributions. A shared scalar is always
allocated on a single PE, which may differ from one scalar to another. Dimensional
distributions include the following: cyclic distribution, generalized distribution, block
distribution and degenerate distribution.

Cyclic distribution (:BLOCK(1)) assigns one element of the shared array to each PE in turn,
returning to the first PE when every PE has an element. Generalized distribution
(:BLOCK(n)) assigns blocks of 'n' elements of the array to successive PEs, where 'n' has
to be an integer power of 2. Block distribution (:BLOCK) divides an array dimension
into N$PES blocks and allocates one block to each PE; the block size equals the array size
divided by N$PES. Degenerate distribution (:) forces an entire dimension to be allocated
on a single PE.
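The distributions described above amount to different mappings from an array index to the PE that owns it. The Python sketch below illustrates those mappings; it is not part of CRAFT. The function names are hypothetical, indices are 0-based (Fortran arrays are normally 1-based), and the block distribution assumes the array size is an exact multiple of the number of PEs.

```python
def owner_block_n(index, n_pes, block=1):
    """Owner of element `index` under :BLOCK(block); block must be a power of 2."""
    if block < 1 or block & (block - 1):
        raise ValueError("block size must be a power of 2")
    return (index // block) % n_pes


def owner_cyclic(index, n_pes):
    """Cyclic distribution is the special case :BLOCK(1)."""
    return owner_block_n(index, n_pes, 1)


def owner_block(index, n_pes, array_size):
    """:BLOCK gives each PE one contiguous block of array_size / n_pes elements."""
    block = array_size // n_pes      # assumes array_size is divisible by n_pes
    return index // block


n_pes = 4
print([owner_cyclic(i, n_pes) for i in range(8)])       # [0, 1, 2, 3, 0, 1, 2, 3]
print([owner_block_n(i, n_pes, 2) for i in range(8)])   # [0, 0, 1, 1, 2, 2, 3, 3]
print([owner_block(i, n_pes, 16) for i in range(16)])
```

A degenerate distribution needs no mapping function, since every index along that dimension has the same owner.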

3.3.2.  WORK  SHARING 

Executing the statements of a program in parallel, on the same PEs on which the
data are distributed, achieves higher performance on the Cray MPP system. Work sharing
is achieved primarily in two ways: automatic arrays and shared DO loops.

Fortran array syntax (automatic arrays) is one way to distribute work. A Fortran
statement using array syntax and involving shared arrays encountered in a parallel region
causes all processors to execute the statement. The compiler maximizes data locality, i.e.,
the work is distributed so that each task operates on its local data.


      DIMENSION Z1(64), Z2(64), VNEW(64)

CDIR$ SHARED Z1(:BLOCK), Z2(:BLOCK), VNEW(:BLOCK)


      VNEW = Z1 - Z2


The DOSHARED directive is the second way of achieving work sharing. Because loops by
themselves do not create parallelism, work sharing of DO loops is achieved by distributing
the iterations across all available tasks. Each task is assigned a set of iterations of a
shared loop to execute. A shared loop does not guarantee the order in which iterations will
be executed, which lets the system execute iterations concurrently. There is an implicit
barrier synchronization at the end of a shared loop. An example of the DOSHARED directive,
from subroutine VOLUME, is as follows:

CDIR$ DOSHARED (I) ON VNEW(I)

      DO 10, I = 1, LNL1
         VNEW(I) = Z1(I) - Z2(I)

   10 CONTINUE

Private loops may appear inside or outside a shared loop, but shared loops must be
tightly nested; otherwise the inner shared loop is executed as a private loop. The
distribution mechanism for a shared loop affects program performance rather than
correctness. A proper choice of iteration alignment provides a higher degree of locality
(when the references in an iteration are close together). The aligned distribution
mechanism is designed to place iterations within tasks on the PEs where the references
reside.
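The aligned distribution of a shared loop can be pictured as an "owner computes" rule: each PE executes exactly those iterations whose aligned array element it owns. The Python sketch below imitates this for a loop aligned ON VNEW(I). It is a sequential illustration with hypothetical names, it assumes VNEW is distributed :BLOCK(block), and it makes no claim about how the CRAFT compiler actually schedules iterations.

```python
def shared_loop_iterations(my_pe, n_pes, n_iter, block):
    """1-based iterations that PE `my_pe` executes for a loop aligned ON VNEW(I),
    assuming VNEW is distributed :BLOCK(block)."""
    return [i for i in range(1, n_iter + 1)
            if ((i - 1) // block) % n_pes == my_pe]


def run_shared_loop(z1, z2, n_pes, block):
    """Emulate the shared loop VNEW(I) = Z1(I) - Z2(I), one PE after another."""
    vnew = [None] * len(z1)
    executed_by = [None] * len(z1)
    # On the real machine the PEs run concurrently and in no guaranteed order.
    for pe in range(n_pes):
        for i in shared_loop_iterations(pe, n_pes, len(z1), block):
            vnew[i - 1] = z1[i - 1] - z2[i - 1]
            executed_by[i - 1] = pe
    return vnew, executed_by


z1 = [10.0 * k for k in range(8)]
z2 = [1.0] * 8
vnew, executed_by = run_shared_loop(z1, z2, n_pes=4, block=2)
print(executed_by)   # [0, 0, 1, 1, 2, 2, 3, 3]
```

Every iteration is executed exactly once, and always by the PE that holds the element being written, so the assignment to VNEW needs no communication in this model.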

A private loop is executed only by the task that invokes it, and no work is shared
between tasks. Private loops define program behavior by defining the behavior of the
individual tasks. Private loops have exactly the same semantics as loops in standard
Fortran. No special syntax is required to specify a loop as private, as it is the default.

3.3.3.  SHARED  TO  PRIVATE  COERCION 

The cardinal rule for distributed memory machines is to exploit data locality, i.e., to
work with local data and avoid communication as much as possible. Performance without
communication far exceeds that with communication.

CRAFT offers two paradigms that involve no interprocessor communication. The
highest performance is attained by shared to private coercion; the other paradigm is the
PE_RESIDENT directive.

In shared to private coercion, an actual argument declared as a shared array is passed to
a corresponding dummy argument declared as a private array. This causes each PE to
pass only its own data to the subroutine, so that the subroutine accesses array
elements that are strictly local to the executing PE as private data, without additional
overhead.

The  example  illustrated  here  is  from  the  STRAIN  subroutine: 

      PROGRAM START_STRAIN

      INTEGER L1, LN, M, LNL1
      REAL Z1DOT(MXLB), Z2DOT(MXLB), HMIN(MXLB), EZDOT(MXLB),
     2     EXDOT(MXLB), EXYDOT(MXLB), EYDOT(MXLB)

CDIR$ GEOMETRY GG(:BLOCK(1))
CDIR$ SHARED (GG) :: Z1DOT, Z2DOT, HMIN, EZDOT, EXDOT, EXYDOT,
     2               EYDOT


      LNL1 = (LN-L1+1)/N$PES
      CALL STRAIN(L1, LN, EXDOT, EXYDOT, EYDOT, EZDOT,
     *            HMIN, Z1DOT, Z2DOT,...LNL1)


      END

      SUBROUTINE STRAIN(L1, LN, EXDOT, EXYDOT, EYDOT, EZDOT,
     *                  HMIN, Z1DOT, Z2DOT,...LNL1)

C     STRAIN computes strain rates

      REAL Z1DOT(*), Z2DOT(*), HMIN(*), EXDOT(*), EZDOT(*),
     2     EXYDOT(*), EYDOT(*)


      IF (IGEOM.EQ.1) THEN
         DO 10, I = 1, LNL1
            EZDOT(I) = (Z1DOT(I)-Z2DOT(I))/HMIN(I)
            EXDOT(I) = 0.0


   10    CONTINUE


      RETURN
      END


In this example, the variables EXDOT, EYDOT, EXYDOT, EZDOT, Z1DOT,
Z2DOT and HMIN are declared as shared arrays in the calling program, but as private
arrays in the called subroutine STRAIN. This causes each PE to pass only its own data to
the subroutine: the first element of each array is located on PE0, the second element on
PE1, and so on. So, when the DO loop is executed, all the executable statements work on
local data. This reduces the communication time and improves the performance of the
parallel code.
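
The element-to-PE mapping described above can be sketched as follows. This is a minimal illustration written in Python rather than CRAFT Fortran. It assumes that :BLOCK(N) deals blocks of N consecutive elements to the PEs in round-robin order; the helper names owner_pe and local_elements are hypothetical and do not appear in the EPIC code.

```python
# Hypothetical sketch of element ownership under a :BLOCK(N) distribution.
# Assumption: blocks of N consecutive elements are dealt to PEs round-robin.

def owner_pe(i, n_pes, block=1):
    """Return the 0-based PE number that owns the 1-based array element i."""
    return ((i - 1) // block) % n_pes


def local_elements(pe, n, n_pes, block=1):
    """Return the 1-based global indices of the elements that reside on PE pe."""
    return [i for i in range(1, n + 1) if owner_pe(i, n_pes, block) == pe]


n, n_pes = 16, 4
for pe in range(n_pes):
    mine = local_elements(pe, n, n_pes)
    # Each PE holds n / n_pes elements, which is why the private loop in the
    # subroutine only needs to run up to (LN-L1+1)/N$PES.
    print(f"PE{pe}: {mine} ({len(mine)} local elements)")
```

With the private dummy arrays, each PE indexes its own elements as 1, 2, ... locally, so the loop bound shrinks by a factor of the number of PEs and no remote references are generated.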

4. PERFORMANCE OF INDIVIDUAL SUBROUTINES


4.1 PERFORMANCE MODELING AND SCALABILITY ANALYSIS FOR
PARALLEL SYSTEM

A  sequential  algorithm  is  evaluated  in  terms  of  its  execution  time,  expressed  as  a 
function  of  the  size  of  its  input.  The  execution  time  of  a  parallel  algorithm  depends  not 
only  on  input  size  but  also  on  the  architecture  of  the  parallel  computer  and  the  number  of 
processors. A parallel system is a combination of an algorithm and the parallel
architecture  on  which  it  is  implemented.  The  performance  of  a  parallel  program  takes 
into  account  execution  time,  scalability  of  computational  kernels,  the  mechanisms  with 
which  data  is  generated,  stored,  transmitted  over  networks,  moved  to  and  from  disks,  and 
passed  between  different  stages  of  a  computation.  Metrics  used  to  measure  performance 
include  execution  time,  parallel  efficiency,  memory  requirements,  throughput,  latency, 
input/output  rates,  network  throughput,  design  costs,  implementation  costs,  verification 
costs,  potential  for  reuse,  hardware  requirements,  hardware  costs,  maintenance  costs, 
portability and scalability. The relative importance of these diverse metrics varies with
the nature of the problem at hand. A specification may provide hard numbers for some
metrics, require that others be optimized, and ignore yet others. For example, the design
specification for an operational weather forecasting system may specify maximum
execution time (for example, the forecast must be complete within four hours), hardware
costs, and implementation costs, and require that the fidelity of the model be maximized
within these constraints. In addition, reliability is of particularly high importance, as may
be scalability to future generations of computers. For a different application like image
processing,  one  is  not  concerned  with  total  time  required  to  process  a  certain  number  of 
images  but  rather  with  the  number  of  images  that  can  be  processed  per  second 
(throughput)  or  the  time  that  it  takes  a  single  image  to  pass  through  the  pipeline  (latency). 
Throughput  would  be  important  in  a  video  compression  application,  while  latency  would 
be  important  if  the  program  formed  a  part  of  a  sensor  system  that  must  react  in  real  time 
to events detected in an image stream.

For the EPIC hydro code, though the wall clock time is important, the most important
metric is the cost of computation. Comparing the cost of computation on the Cray-YMP
and the Cray-T3D, using 32 processors on the Cray-T3D is cheaper than running an
application code on a single YMP processor. So, the goal of parallelizing the EPIC code
is to run it on a more cost-effective parallel system.


4.2 PERFORMANCE METRICS IN PARALLEL SYSTEMS

Some  of  the  metrics  that  are  commonly  used  to  measure  the  performance  of  parallel 
systems  are  described  below: 

4.2.1 RUN TIME

The serial run time of a program is the time elapsed between the beginning and the
end of its execution on a serial computer. The parallel run time is the time that elapses
from the moment that a parallel computation starts to the moment that the last processor
finishes execution.

4.2.2.  SPEEDUP 

Speedup is defined as the ratio of the time taken to solve a problem on a single processor
to the time required to solve the same problem on a parallel computer with p identical
processors. It is denoted by the symbol S. When evaluating a parallel system, one is
interested in knowing how much performance gain is achieved by parallelizing a given
application over a sequential implementation. Speedup is a measure that captures the
relative benefit of solving a problem in parallel.

When a serial computer is used, it is natural to use the sequential algorithm that
solves the problem in the least amount of time. So, to judge a parallel algorithm fairly, it
is compared to the fastest sequential algorithm on a single processor. When the fastest
sequential algorithm to solve a problem is not known, or is impractical to implement, the
fastest known and practical algorithm is chosen as the best sequential algorithm. Then,
the performance of the parallel algorithm is compared to that of the best sequential
algorithm for the same problem. So, speedup is defined as the ratio of the run time of the
best sequential algorithm for solving a problem to the time taken by the parallel algorithm
to solve the same problem on p processors. The p processors used by the parallel
algorithm are assumed to be identical to the one used by the sequential algorithm.

4.2.3 EFFICIENCY

An  ideal  parallel  system  containing  p  processors  can  deliver  a  speedup  equal  to  p.  In 
practice,  ideal  behavior  is  not  achieved  because  while  executing  a  parallel  algorithm,  the 
processors  cannot  devote  100  percent  of  their  time  to  the  computation  of  the  algorithm. 
Part of the time required by the processors to solve the problem is spent in
communication. Efficiency is a measure of the fraction of time for which a processor is
usefully employed; it is defined as the ratio of speedup to the number of processors. In an
ideal parallel system, speedup is equal to p and efficiency is equal to one. In practice,
speedup is less than p and efficiency is between zero and one, depending on the degree of
effectiveness with which the processors are utilized. Denoting efficiency as E,

E = S/p


4.2.4  COST 

Cost  of  solving  a  problem  on  a  parallel  system  is  defined  as  the  product  of  run  time 
and  the  number  of  processors  used.  Cost  reflects  the  sum  of  the  time  that  each  processor 
spends  solving  the  problem.  The  cost  of  solving  a  problem  on  a  single  processor  is  the 
execution  time  of  the  fastest  known  sequential  algorithm.  A  parallel  system  is  said  to  be 
cost  optimal  if  the  cost  of  solving  a  problem  on  a  parallel  computer  is  proportional  to  the 
execution  time  of  the  fastest-known  sequential  algorithm  on  a  single  processor.  Since 
efficiency is the ratio of sequential cost to parallel cost, a cost optimal parallel system has
an efficiency on the order of 1.
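
The metrics of Sections 4.2.1 to 4.2.4 can be collected in one place. The symbols T_S (run time of the best sequential algorithm), T_P (parallel run time on p processors) and C (cost) are notation assumed here for convenience; only S, E and p are named in the text.

```latex
\begin{align*}
S &= \frac{T_S}{T_P}, \\
E &= \frac{S}{p} = \frac{T_S}{p\,T_P}, \\
C &= p\,T_P, \\
\text{cost optimal:}\quad p\,T_P &= \Theta(T_S) \;\Longleftrightarrow\; E = \Theta(1).
\end{align*}
```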

4.3 SCALABILITY OF PARALLEL SYSTEMS AND AMDAHL’S LAW

The  number  of  processors  is  the  upper  bound  on  the  speedup  that  can  be  achieved  by 
a parallel system. Speedup is one for a single processor, but if more processors are used,
speedup is usually less than the number of processors. This phenomenon is explained by
Amdahl’s law.


Amdahl’s  law  states  that  if  the  sequential  component  of  an  algorithm  accounts  for  1/s 
of  the  program’s  execution  time,  then  the  maximum  possible  speedup  that  can  be 
achieved  on  a  parallel  computer  is  s.  This  is  because  every  algorithm  has  a  sequential 
component  that  will  eventually  limit  the  speedup  that  can  be  achieved  on  a  parallel 
computer.  For  example,  if  the  sequential  component  is  5  percent,  then  the  maximum 
speedup  that  can  be  achieved  is  20. 
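
The bound can be checked numerically. The sketch below uses the usual closed form of Amdahl's law, S(p) = 1 / (f + (1 - f)/p), where f is the sequential fraction (f = 1/s in the notation above). The function name amdahl_speedup is hypothetical, and the model assumes that the parallel part divides perfectly among the p processors with no communication cost.

```python
def amdahl_speedup(serial_fraction, p):
    """Speedup predicted by Amdahl's law for p processors.

    serial_fraction is the share of the run time that cannot be parallelized.
    """
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)


f = 0.05  # the 5 percent sequential component used in the text
for p in (1, 2, 4, 8, 16, 32, 64, 128, 1024):
    s = amdahl_speedup(f, p)
    # Efficiency is speedup divided by the number of processors.
    print(f"p={p:5d}  speedup={s:7.3f}  efficiency={s / p:6.3f}")

# As p grows without bound the speedup approaches 1 / f.
print("limit:", 1.0 / f)
```

The loop also shows the efficiency falling steadily as p increases, even though the speedup is still rising toward its limit.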

As a consequence of Amdahl’s law, two phenomena are observed. First, the efficiency
drops with an increasing number of processors. Second, a larger instance of the same
problem yields higher speedup and efficiency for the same number of processors,
although both speedup and efficiency continue to drop with increasing p. These two
phenomena are common to a large class of parallel systems.


Given  that  increasing  the  number  of  processors  reduces  efficiency  and  that  increasing 
the  size  of  the  computation  increases  efficiency,  it  should  be  possible  to  keep  the 
efficiency  fixed  by  increasing  both  the  size  of  the  problem  and  the  number  of  processors 
simultaneously. For example, the efficiency of an algorithm for adding 64 numbers using
four processors is 0.8. If the number of processors is increased to 8 and the size of the
problem is scaled up to add 192 numbers, the efficiency remains 0.8. This ability to
maintain efficiency at a fixed value by simultaneously increasing the number of
processors and the size of the problem is called the scalability of a parallel system. The
scalability of a parallel system is a measure of its capacity to increase speedup in
proportion to the number of processors. It reflects a parallel system’s ability to utilize
increasing processing resources effectively. The scalability and cost-optimality of parallel
systems are related. A scalable parallel system can be made cost-optimal if the number of
processors and the size of the computation are chosen appropriately. A good scalable
system is one whose efficiency does not decrease as the problem size increases.
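
The figures in the adding example can be reproduced with a short sketch. The cost model used here is an assumption, since the text does not state one: each processor adds n/p numbers locally, and combining the partial sums costs 2 log2(p) time units, which is a common textbook model for this problem. The function names are hypothetical.

```python
import math


def parallel_add_time(n, p):
    """Assumed parallel run time for adding n numbers on p processors:
    n/p local additions plus 2*log2(p) units to combine the partial sums."""
    return n / p + 2 * math.log2(p)


def efficiency(n, p):
    """Efficiency E = S / p, taking the serial run time as n unit steps."""
    speedup = n / parallel_add_time(n, p)
    return speedup / p


for n, p in [(64, 4), (64, 8), (192, 8)]:
    print(f"n={n:4d}  p={p:2d}  E={efficiency(n, p):.3f}")
```

Under this model, keeping n = 64 while doubling p to 8 lowers the efficiency, whereas scaling the problem to n = 192 restores the value obtained for n = 64 on four processors.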




4.4 ISOEFFICIENCY OF PARALLEL SYSTEMS


It  is  useful  to  determine  the  rate  at  which  the  problem  size  must  increase  with  respect 
to the number of processors to keep the efficiency fixed. For different parallel systems,
the  problem  size  must  increase  at  different  rates  in  order  to  maintain  fixed  efficiency  as 
the  number  of  processors  is  increased.  This  rate  determines  the  degree  of  scalability  of  the 
parallel  system. 

For scalable parallel systems, efficiency can be maintained at a fixed value (between
0 and 1) if the ratio T(W,p)/W is kept constant, since

E = 1 / (1 + T(W,p) / W)

where E is the efficiency, T is the overhead function, W is the problem size and p is the
number of processors.

Solving for W, the equation becomes

W = K T(W,p)

where K = E/(1-E) is a constant for a given efficiency.
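
For completeness, the intermediate algebra between the two equations can be written out as follows, using the same symbols E, T, W, p and K as in the text.

```latex
\begin{align*}
E &= \frac{1}{1 + T(W,p)/W} \\
\frac{1}{E} &= 1 + \frac{T(W,p)}{W} \\
\frac{T(W,p)}{W} &= \frac{1 - E}{E} \\
W &= \frac{E}{1 - E}\,T(W,p) = K\,T(W,p), \qquad K = \frac{E}{1 - E}.
\end{align*}
```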

From  the  above  equation,  the  problem  size  W  can  be  obtained  as  a  function  of  p.  This 
function  dictates  the  growth  rate  of  W  required  to  keep  the  efficiency  fixed  as  p  increases. 
This  function  is  known  as  the  isoefficiency  function  of  the  parallel  system.  The 
isoefficiency  function  determines  the  ease  with  which  a  parallel  system  can  maintain  a 
constant  efficiency  and  hence  achieve  speedups  increasing  in  proportion  to  the  number  of 
processors. A small isoefficiency function means that small increments in problem size
are sufficient for the efficient utilization of an increasing number of processors, indicating
that the parallel system is highly scalable. A large isoefficiency function indicates a poorly
scalable parallel system.

In a single expression, the isoefficiency function captures the characteristics of a
parallel algorithm as well as the parallel architecture on which it is implemented. After
performing the isoefficiency analysis, one can test the performance of a parallel program
on a few processors and then predict its performance on a larger number of processors.
However, the utility of the isoefficiency function is not limited to predicting the impact on
performance of an increasing number of processors.

4.5 SIGNIFICANCE OF GRAPHICAL PLOTS

Programmers  use  parallelism  to  make  their  programs  solve  a  single  problem  in  less 
time or solve larger problems in a fixed time. A substantial amount of effort is spent in
presenting the performance in the best possible way. So, it is very important to
understand the significance of each graphical plot in terms of the performance of the
parallel code. In the present work, the following graphical plots are used to present the
performance of individual subroutines.

•  Number of Processors v/s Wall Clock Time: In this plot, the wall clock times of the
code using multiple processors on the T3D and on the YMP are compared. If the wall
clock time of 32 processors on the T3D is less than the YMP wall clock time, the code or
part of the code is cost effective.

•  Number of Processors v/s Speedup: In this plot, the actual speedup using a
particular number of processors is compared with the ideal speedup (no
communication assumed). It gives an idea of how much time is spent in
communication rather than computation as the number of processors increases.

•  Number of Processors v/s Efficiency: In this plot, the efficiency of the algorithm is
seen. It gives an idea of the ideal number of processors needed to run the code.

4.6 PARALLELIZATION OF INDIVIDUAL SUBROUTINES

All the subroutines discussed in the thesis were handled in the same manner as
described for parallelizing VOLUME. Data parallelism, work sharing and shared to
private coercion were the techniques used for parallelizing individual subroutines.

4.6.1. VOLUME

The VOLUME subroutine computes volumes, volumetric strains and strain rates. The
VOLUME subroutine is called by the subroutine ELOOP. To simulate a problem for the
subroutine, the variables and arrays are initialized with data in the calling routine, in this
case subroutine ELOOP. Among such variables are L1, LN, ... X1(I) ... etc. As the
subroutines in the EPIC program have common include files, the array sizes are made
powers of 2 for compiling on the Cray-T3D.

First, the data in the subroutine is shared using different distribution schemes such as
(:BLOCK(1)) and (:BLOCK). Work sharing is implemented explicitly by DOSHARED
directives on the DO loops. A variable LNL1 is defined which is equal to LN-L1+1, and
the original vectorized DO loop index J is eliminated.

The Cray-T3D being a dedicated machine, the real time clock function is used to
measure the wall clock time. For better results, the number of times a subroutine is called
is increased, so that the code accumulates some execution time.

Using :BLOCK(1) and :BLOCK data distribution and work sharing directives gave
the results tabulated in Table 4-1.


Table 4-1: Comparison of :BLOCK and :BLOCK(1) data distribution directives

Number of Processors    :BLOCK(1)    :BLOCK
1                       2.794        2.543
2                       1.121        1.112
4                       0.923        0.883
8                       0.475        0.462


To further improve the performance of the subroutine, the shared to private coercion
technique was implemented. In this technique, shared arrays in the calling routine are
passed to corresponding private dummy arguments. For example, X1 is declared shared
in ELOOP, but declared private in VOLUME. So, in the subroutine VOLUME each PE
has X1 in blocks of array-size divided by the number of PEs (array-size/n$pes, in this
case LNL1/n$pes). Because of this, the DO loops in the subroutine VOLUME index only
up to LNL1/n$pes, so that each PE works on its local data and no interprocessor
communication takes place. The shared to private coercion vastly improves the
performance of the subroutine.


Table 4-2: Wall clock timing of subroutine VOLUME

Number of Processors    Wall clock time
YMP                     0.763
1                       2.298
2                       0.900
4                       0.513
8                       0.281
16                      0.183
32                      0.136
64                      0.118
128                     0.108

In the subroutine VOLUME, the massively parallel code on 4 PEs runs faster than the
vectorized code. This subroutine’s MPP implementation is very cost effective. It is also
observed that the efficiency of the MPP code is quite high with 32 PEs, which implies
that the subroutine scales well.
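
The speedup, efficiency and cost values behind these remarks can be recomputed from Table 4-2. The Python sketch below is illustrative only: it assumes that speedup is measured relative to the 1 PE time on the T3D (the text does not say which baseline was used for the plots), and the names t3d_times, ymp_time and metrics are hypothetical.

```python
# Wall clock times for subroutine VOLUME, copied from Table 4-2.
ymp_time = 0.763
t3d_times = {1: 2.298, 2: 0.900, 4: 0.513, 8: 0.281,
             16: 0.183, 32: 0.136, 64: 0.118, 128: 0.108}


def metrics(times):
    """Return {p: (speedup, efficiency, cost)} for each processor count.

    Assumption: the baseline for speedup is the 1 PE time in the same table.
    """
    t1 = times[1]
    result = {}
    for p, tp in sorted(times.items()):
        speedup = t1 / tp
        result[p] = (speedup, speedup / p, p * tp)
    return result


table = metrics(t3d_times)
for p, (s, e, c) in table.items():
    print(f"p={p:4d}  speedup={s:7.3f}  efficiency={e:6.3f}  cost={c:7.3f}")

# Smallest PE count whose wall clock time beats the single YMP processor.
first_faster = min(p for p, tp in t3d_times.items() if tp < ymp_time)
print("T3D is faster than the YMP from", first_faster, "PEs")
```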

The following figures give an effective representation of the parallel performance of the
subroutine VOLUME.


Figure 4-1: VOLUME: Number of Processors v/s Wall clock time


Figure 4-2: VOLUME: Number of Processors v/s Speedup


Figure 4-3: VOLUME: Number of Processors v/s Efficiency


4.6.2.  EGET 

The  EGET  subroutine  is  used  to  initialize  element  variables  from  nodal  variables.  It  is 
also called by subroutine ELOOP. The variables and arrays are initialized with data in the
calling subroutine.

To further improve performance, shared to private coercion was implemented.
Because the DOSHARED loop in the CRAFT program contains the statement
J = L1 + I - 1, shared to private coercion cannot be implemented as it is, since the
statement has I on the right-hand side of the assignment. So, an array II(I) = I was defined
in the calling routine and shared to private coercion was then implemented.

The performance gain from implementing the shared to private coercion principles can
be seen in the following table. Not only is the performance better than that of the
corresponding CRAFT MPP code, but the code is also more scalable.

In the EGET subroutine, the MPP code gives better performance than the YMP code
when run on 32 PEs. So, this subroutine is also cost effective, but not as much as
subroutine VOLUME. As the subroutine EGET contains more control statements than
subroutine VOLUME, the efficiency of the parallel EGET is lower; in particular, its
efficiency on 32 PEs is not as high as that of VOLUME.


Table 4-3: EGET: Number of Processors v/s Wall clock time

Number of Processors    Wall clock time
YMP                     0.257
1                       2.368
2                       1.276
4                       0.735
8                       0.454
16                      0.314
32                      0.244




Figure  4-5:  EGET:  Number  of  Processors  v/s  Speedup 


Figure  4-6:  EGET:  Number  of  Processors  v/s  Efficiency 


4.6.3. GMCON

GMCON computes geometric constants for elements in the common core
positions. It was parallelized by the shared to private coercion technique. The data
parallel scheme implemented was :BLOCK. The wall clock time of 16 processors on the
T3D is less than that of the YMP, so GMCON is a cost-effective routine to parallelize.

The parallel GMCON running on 16 PEs was faster than the YMP vectorized code.
The efficiency curve showed that the efficiency of the parallel code was good (> 0.6) for
32 PEs, and thus GMCON could be called a scalable subroutine.


Table 4-4: GMCON: Number of Processors v/s Wall clock time

Number of Processors    Wall clock time
YMP                     0.0163
1                       0.1836
2                       0.0788
4                       0.4060
8                       0.0223
16                      0.0133
32                      0.0090

Figure  4-7:  GMCON:  Number  of  Processors  v/s  Wall  clock  time 

SUPRAMOLECULAR MULTILAYER ASSEMBLIES WITH
PERIODICITIES  IN  A  SUBMICRON  RANGE 
(A  step  toward  smart  optical  filters) 


VLADIMIR  V.  TSUKRUK 

MATERIALS  PROGRAM,  WESTERN  MICHIGAN  UNIVERSITY, 
KALAMAZOO,  MI  49008 


Abstract. 


Organized  ultrathin  films  are  the  subject  of  great  and  growing  interest  because  of  their 
compatibility with future nano-scale technologies. We use the supramolecular engineering
approach, manipulating a single building unit, a supramolecular assembly, with a chemically
pre-determined nature of functionality and dimensions, to build organized films with
large-scale periodicity of multilayer structures. Supramolecular self-assembled films are fabricated
by  electrostatic  layer-by-layer  deposition  and  electrostatic  deposition  assisted  by  dip-coating 
and  spin-coating.  We  use  three  very  different  classes  of  charged  polymeric  materials: 
amorphous  coiled  polyions,  dendritic  macromolecules,  and  polymer  latex  nanoparticles. 
Self-assembled films are studied by scanning probe microscopy, X-ray reflectivity,
ellipsometry, and contact-angle measurements. We demonstrate that replacing unstructured
coiled macromolecular chains with organized dendritic supramolecules or “bulk”
nanoparticles increases the growth increment and internal periodicity by an order of
magnitude compared with conventional amorphous films. In-plane ordering can be
controlled by deposition time, ionization state, and application of capillary or shearing forces.
The routine proposed may be used for formation of supramolecular films with mesoscale
periodicity and intriguing optical properties. We present the first truly macroscopically
ordered monolayer of charged latex nanoparticles obtained by force assisted electrostatic
deposition with lateral sizes extended to a fraction of a millimeter.


Introduction. 


Organized ultrathin films are the subject of great and growing interest because of their
compatibility with future nano-scale technologies. Some examples are integrated optical
coatings with a gradient of refractive indices along the normal to the surface plane and
microelectromechanical systems modified by boundary lubricants on the molecular scale.
Research in these fields focuses on supramolecular functional polymeric materials with
suitable physical properties (e.g., non-linear, photochromic, or sensing) and abilities to form
organized superstructures at the interfaces. The search for new materials with a suitable
combination of properties, microstructure, and intermolecular interactions is under way.

In our studies, we use the supramolecular engineering approach to manipulate a
single building unit, a supramolecular assembly, with a chemically pre-determined nature of
functionality and dimensions to build organized films with controllable, large-scale
periodicity of multilayer structures (see Scheme 1 for general comparison). Modulation of
the internal structural periodicity of polymer films at a submicron scale, which results in an
accompanying variation of the refractive index, is an interesting route toward the next
generation of optical reflective filters.

Currently, several approaches exist for fabrication of organized molecular assemblies
from functional macromolecular materials. Electrostatic layer-by-layer deposition from dilute
solutions, which exploits Coulombic interactions between oppositely charged molecules, has
become a widely used method since 1991. It has been shown that this approach can be
used to build multilayer films (hundreds of layers) with various combinations of molecular
fragments, organic and inorganic layers, latexes, molecules with switchable conformation,
biomolecules, photochromic molecules, and conductive polymers.

[Scheme 1 label: conventional multilayer film]

Scheme 1. Possible architectures of supramolecular films.


The  layer-by-layer  self-assembling  process  of  amorphous  polyions  in  its  initial  stage  requires 
special  attention.  The  mechanical  and  temporal  stability  of  the  first  molecular  layers  tethered 
to  a  solid  substrate  is  a  critical  element  to  the  homogeneity  of  thicker  films.  A  gradient  of 
molecular  ordering  across  the  first  several  molecular  layers  is  observed  for  various  molecular 
films.  This  phenomenon  is  usually  related  to  healing  of  substrate  inhomogeneities  and 
non-equilibrium  behavior.  Formation  of  non-equilibrium  surface  morphologies  and 
inhomogeneous  coverage  is  caused  by  the  competition  of  the  kinetics  of  polymer  chain 
adsorption and their surface diffusion. It is speculated that assembly of polyions on charged
surfaces is indeed a two-stage process: macromolecular chains are anchored to the surface by
some segments during the short initial stage and then relax to a dense packing during the long
second stage of self-assembly.

In  the  present  report,  we  summarize  our  observations  on  the  formation  of  the 
supramolecular  self-assembled  films  by  electrostatic  deposition  of  three  very  different  classes 
of  polymeric  materials:  amorphous  coiled  polyions,  dendritic  macromolecules,  and  polymer 
latex nanoparticles (Figure 1). We make preliminary conclusions on the feasibility of the
supramolecular engineering approach for building organized polymer films with mesoscale
periodicity.


Figure 1. Electrostatic adsorption of charged polymer coils, dendrimers, and nanoparticles
on oppositely charged surfaces.


Experimental 

Negatively-positively charged pairs of polystyrene sulfonate (PSS) and polyallylamine
(PAA) (Figure 2), PS latexes of 20 - 200 nm in diameter with amino and carboxy surface
groups (Table 1), and polyamidoamine dendrimers with surface amine groups for even
generations and carboxylic groups for odd generations (G3.5 - G10) (Figure 3) were
selected for this study.

Figure 2. Self-assembled films and PSS/PAA formulas.


Table 1. Latex characteristics and notations

Latex                   Mean Particle       Standard Deviation   Surface Charge Type      Type of the    Notation
                        Diameter (D), nm    of D, %              and Density              Surface
Amidine-modified PS     20                  23.3                 positive, 1.7 µC/cm²     hydrophobic    AL20
Carboxyl-modified PS    20                  15.3                 negative, 17.7 µC/cm²    hydrophilic    CML20
Sulfate PS              40                  13.9                 negative, 2.5 µC/cm²     hydrophobic    SL40
Amidine-modified PS     190                 1.5                  positive, 8.3 µC/cm²     hydrophobic    AL190
Carboxyl-modified PS    190                 2.7                  negative, 221 µC/cm²     hydrophilic    CML190
Sulfate PS              200                 4.6                  positive, 1.1 µC/cm²     hydrophobic    SL200

[Figure 3 labels: Surface Group (Z); Repeated Unit (RU)]

Figure 3. Chemical structure of dendrimers and a scheme of G4 dendrimer.


The positively charged surface is an amine terminated self-assembled monolayer
(SAM) and the negatively charged substrate is a silicon oxide layer of a silicon wafer. An
electrostatic layer-by-layer deposition technique was employed for the formation of the films
from aqueous solution at neutral conditions (pH = 6.5). At this pH both polyions possess
some net charge as a result of dissociation (PSS) and protonation (PAA) of side groups.
Solid substrates used were silicon wafers ((100) orientation, SAS) and float glass (Aldrich)
modified by 3-aminopropyl triethoxysilane. Cleaning and modification of the substrates as
well as formation and transfer of the monolayers onto the solid supports were performed
using rigorous procedures. All PSS/PAA films were prepared in a class 100 clean room
and stored in sealed containers.

Silanized substrates were protonated in a silicon wafer holder with a water solution of
0.01 N HCl. The 0.01 N HCl was poured into the holder and allowed to react for 2 min.
The HCl was then poured out and the substrates in the holder were flushed under running
Milli-Q water. The wafer pieces were then rinsed with Milli-Q water individually and dried
with dry nitrogen. The concentrations of the aqueous PSS and PAA solutions were 2 mg/ml.
The protonated substrates were dipped in the PSS solution for the appropriate amount of time,
as designated below (from 1 second to 64 minutes). A set of substrates was dipped in the
PSS solution for 64 minutes and then used for dipping in the PAA solution for different
times, as designated below. This procedure for sample fabrication, described in detail for
PSS/PAA films, was basically unchanged for the other types of self-assembling films studied
here.


Atomic  force  (AFM)  and  friction  force  (FFM)  images  of  fabricated  films  in  contact 
and  non-contact  (the  "tapping")  modes  were  obtained  in  air  at  ambient  temperature  with  the 
Nanoscope IIIA - Dimension 3000 (Digital Instruments, Inc.) according to well-established
procedures (Figure 4). We observed that a combination of tapping-mode or contact-mode
scanning with tip modification allowed stable, reproducible imaging of the soft
monolayers without visible damage. Images were obtained on scales from 200 nm to 100
µm, but for further analysis we selected the two most appropriate “standard” sizes of 2 x 2 µm
and 5 x 5 µm. All microroughness values reported here were measured for 1 µm x 1 µm
areas.


The  AFM  tips  were  chemically  modified  to  introduce  appropriate  surface  charge  and 
avoid tip contamination by charged macromolecules from the specimens. Tip radii were in
the  range  20  -  40  nm  as  estimated  by  scanning  a  standard  specimen  with  tethered  colloidal 
gold  nanoparticles  with  known  diameters  according  to  the  published  procedure.  AFM 
images  were  obtained  for  several  specimens  prepared  under  identical  conditions  at  different 
periods  of  time,  and  at  several  randomly  selected  film  areas.  All  structural  parameters 
discussed  below  were  averaged  over  7  -  10  independent  measurements  after  image 
processing. 

Figure 4. General scheme of AFM technique.

X-ray reflectivity measurements were performed on a Siemens D-5000 diffractometer
equipped with a reflectometry stage. X-ray data were collected within a 0 - 7° scattering angle
range with a step of 0.02° using monochromatized CuKα radiation. Measurements within two
angle intervals with different parameters (X-ray tube power and accumulation time) were
rescaled to one angle interval. Simulation of X-ray curves was done with the REFSIM 1.0
program by direct computation of Fresnel reflectivity. We used standard database densities
and refractive indices for the substrates. Polymer film refractive indices were determined
from database data for chemical elements in accordance with their chemical composition. For
X-ray reflectivity simulations we used a double-layer model of the surface structure of silicon
substrates (silicon-silicon dioxide layers) with parameters determined independently for bare
substrates. For polymer monolayers, we assumed a homogeneous density distribution along
the surface normal within a single molecular layer with Gaussian interfacial zones. Fitting
parameters for polymer films were thickness, specific gravity, and roughness of polymer
films.
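
The kind of layer-model simulation described above can be illustrated with a short sketch. It is not the REFSIM program: it is a minimal Parratt-type recursion over Fresnel coefficients with Gaussian (Nevot-Croce) roughness damping, assuming each layer is homogeneous. The function name, the layer tuple format, and all optical constants in the example stack are illustrative assumptions, not values from this study.

```python
import cmath
import math

CU_KALPHA_NM = 0.15418  # Cu K-alpha wavelength in nm


def reflectivity(theta_deg, layers, wavelength_nm=CU_KALPHA_NM):
    """Specular reflectivity |R|^2 at one grazing angle (hypothetical helper).

    layers: (thickness_nm, delta, beta, roughness_nm) tuples ordered from the
    top film down to the substrate. The refractive index of a layer is
    n = 1 - delta + i*beta, the roughness belongs to its upper interface,
    and the substrate thickness is ignored (pass 0.0).
    """
    theta = math.radians(theta_deg)
    k0 = 2.0 * math.pi / wavelength_nm
    # Medium 0 is air; the rest follow the order of `layers`.
    indices = [1.0 + 0.0j] + [complex(1.0 - d, b) for (_, d, b, _) in layers]
    # Normal component of the wave vector in every medium.
    kz = [k0 * cmath.sqrt(n * n - math.cos(theta) ** 2) for n in indices]

    amplitude = 0.0 + 0.0j  # nothing is reflected from below the substrate
    for j in range(len(layers) - 1, -1, -1):  # interfaces, bottom to top
        thickness, _, _, sigma = layers[j]
        k_up, k_dn = kz[j], kz[j + 1]
        fresnel = (k_up - k_dn) / (k_up + k_dn)
        fresnel *= cmath.exp(-2.0 * k_up * k_dn * sigma ** 2)  # Gaussian interface
        phase = cmath.exp(2.0j * k_dn * thickness)  # round trip through layer j
        amplitude = (fresnel + amplitude * phase) / (1.0 + fresnel * amplitude * phase)
    return abs(amplitude) ** 2


# Illustrative stack: polymer film / SiO2 / Si (optical constants are rough guesses).
stack = [
    (5.6, 4.0e-6, 1.0e-8, 0.5),  # polymer film
    (1.5, 7.1e-6, 1.0e-7, 0.3),  # oxide layer
    (0.0, 7.6e-6, 1.7e-7, 0.3),  # silicon substrate
]
# Reflectivity curve sampled every 0.02 degrees up to 7 degrees.
curve = [(0.02 * i, reflectivity(0.02 * i, stack)) for i in range(1, 351)]
```

In a fit, the film thickness, density (through delta and beta), and roughness would be varied until the computed curve matches the measured one; the sketch only evaluates the forward model.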

All  technical  details  of  experimental  procedures  can  be  found  in  original  publications. 
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Results  and  discussion 


Polyion  monolayer  formation. 

Formation  of  self-assembled  monolayers  is  monitored  for  PSS  adsorbed  on  charged 
SAM  surfaces  and  PAA  on  a  PSS  monolayer.  In  both  cases,  polyions  are  adsorbed  on 
oppositely  charged  surfaces.  Observations  of  PSS  monolayers  at  various  stages  of 
electrostatic  deposition  reveal  inhomogeneous  self-assembly  at  the  earliest  stages  of 
deposition.  During  the  first  several  minutes,  negatively-charged  PSS  macromolecules  tend  to 
adsorb  on  selected  defect  sites  of  positively  charged  SAM  (scratches,  microparticles,  and 
edges)  and  form  islands  composed  of  PSS  coils  (Figure  5a).  At  this  stage,  electrostatic 
adsorption of PSS chains is predominant and equilibration of the surface structure is not
achieved by the slow surface diffusion mechanism. The temporal variation of the structural
parameters of monolayers and bilayers is collected in Table 2.


Table 2. Structural parameters of adsorbed PSS monolayers and PAA/PSS bilayer.

Columns: Time, sec | height, PSS (1) | height, PSS (2) | height, PAA (1) | rms, PSS (1) | rms, PAA (1)

Time, sec | Values in column order (empty cells are not marked in the source)
0         | 0, 0.0, 0, 0.23, 0.33
10        | 1.5, 1.3, 0.17
20        | 1.2, 2.04
45        | 4.0, 3.5, 1.5, 0.17, 0.42
120       | 1.5, 2.4, 1.4, 0.74
300       | 1.6, 1.7, 0.9, 0.64, 0.32
600       | 1.0, 1.6, 1.0, 0.16, 0.23
1800      | 2.1, 1.2, 0.24
2100      | 0.7, 0.24
3000      | 1.5, 1.2
3840      | 1.0, 2.0, 0.9, 0.26, 0.22

All data are in nm; (1) data from AFM measurements, (2) data from ellipsometry; X-ray thickness for
complete PSS monolayer and complete PAA/PSS bilayer is 1.5 nm and 2.6 nm, respectively.
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Figure  5.  Topographical  images  of  surface  morphology  and  models  of  monolayer  packing 
for  amorphous  polyions  (a,  b),  dendrimers  (c,  d),  and  latex  nanoparticles  (e,  f)  at  early 
stage  (a,  c,  e)  and  in  dense  state  (b,  d,  f). 
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Only  longer  deposition  times  result  in  an  equilibration  of  polymer  layers  and 
formation  of  homogeneous  thin  PSS  layer  composed  of  highly  flattened  macromolecular 
chains (Figure 5b). The monolayer thickness is between 1.0 and 1.5 nm with a
microroughness  of  about  0.2  nm.  Therefore,  the  chains  form  a  very  thin  molecular  layer 
with  a  thickness  of  not  more  than  2-3  molecular  cross-sections.  Self-assembly  of  a  second 
PAA  layer  on  top  of  a  PSS  monolayer  follows  similar  tendencies  resulting  in  the  formation  of 
homogeneous  PAA/PSS  bilayers  with  an  overall  thickness  of  1.7  -  2.5  nm.  This  bilayer  is 
stable and cannot be damaged by the AFM tip.

Dendritic  monolayers 

At  the  initial  stages  of  formation,  isolated  islands  and  network  microstructures  are 
detected  for  various  Starburst  dendrimers  (Figure  3).  All  even  generations  of  dendrimers 
are  observed  to  form  homogeneous,  compact  monolayers  on  a  silicon  surface  (Figure  5c, 
5d). X-ray reflectivity (Figure 6) allows independent measurements of the average thickness
of dendritic monolayers. As observed, the thickness of a single monolayer depends upon
generation, increasing with molecular weight: 1.8 nm (G4), 2.8 nm (G6), and 5.6 nm (G10)
(Figure  7). 

The  average  thickness  of  a  molecular  layer  is  much  smaller  than  the  diameter  of  ideal 
spherical  dendritic  macromolecules.  The  model  of  molecular  ordering  of  dendrimer  films 
assumes  compressed  dendritic  macromolecules  of  oblate  shape  with  an  axial  ratio  in  the  range 
of  1  :  3  to  1  :  5  depending  upon  generation  (Figure  5d).  A  tendency  to  higher  spreading  of 
high  generation  dendrimers  observed  here  corresponds  to  the  surface  behavior  predicted  by 
molecular  dynamic  simulations.  The  high  interaction  strength  between  “sticky”  surface 
groups  along  with  short  range  Van  der  Waals  forces  and  long  range  capillary  forces  are 
considered  to  be  responsible  for  formation  of  compact  monolayer  structures.  Strong 
interactions  between  oppositely  charged  groups  of  dendritic  macromolecules  from  adjacent 
molecular  layers  (such  as  ionic  binding  and  formation  of  multiplets)  may  be  responsible  for 
compression  of  the  soft  architecture  of  dendritic  macromolecules  within  self-assembled  films 
as  proposed  by  molecular  computer  modeling. 
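
The oblate-shape model can be checked with a rough constant-volume estimate. The sketch below assumes that a dendrimer keeps its volume when it is flattened from an ideal sphere into an oblate spheroid whose short axis equals the measured layer thickness. The helper name and the ideal sphere diameters are assumptions introduced for illustration (typical literature-scale values, not data from this report); only the thicknesses are the X-ray values quoted above.

```python
import math


def flattened_shape(sphere_diameter_nm, layer_thickness_nm):
    """Lateral diameter and axial ratio of a sphere flattened at constant volume.

    The sphere (diameter D) and the oblate spheroid (thickness t, lateral
    diameter a) have equal volume when D**3 == a**2 * t.
    """
    lateral_nm = math.sqrt(sphere_diameter_nm ** 3 / layer_thickness_nm)
    return lateral_nm, lateral_nm / layer_thickness_nm


# (assumed ideal sphere diameter, measured monolayer thickness), both in nm.
monolayers = {"G4": (4.5, 1.8), "G6": (6.7, 2.8), "G10": (13.5, 5.6)}
ratios = {name: flattened_shape(d, t)[1] for name, (d, t) in monolayers.items()}

for name, ratio in ratios.items():
    print(f"{name}: thickness : lateral diameter = 1 : {ratio:.1f}")
```

With these assumed diameters the estimated axial ratios fall inside the 1 : 3 to 1 : 5 range of the model, which suggests the range is at least consistent with simple volume conservation.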

Latex  monolayers. 

We observed various stages of formation of a latex monolayer on oppositely charged
substrates. Incomplete monolayers represented by clusters composed of tens to hundreds of
particles were formed by unassisted electrostatic adsorption (Figure 5e). These monolayers
are composed of clusters of several densely packed latex nanoparticles randomly distributed
over the surface. Maximum surface coverage can reach 70% for 20-nm latex nanoparticles.

Figure 6. X-ray reflectivity experimental data (thick lines) and simulated curves (thin lines)
for G6 (a) and G10 (b) monolayer films, plotted as reflectivity (arb. units) versus incident
angle (degrees). Simulated curves were obtained for
film thicknesses of 2.8 and 5.6 nm, roughness of 1.2 and 1.8 nm, and specific
gravities of 1.2 and 1.3 g/cm³ for G6 and G10 dendrimers, respectively.

Figure 7. Spatial dimensions of dendrimers.

Most latex nanoparticles are able to form monolayers with short-range local
ordering. These monolayers possess liquid-type lateral packing of the nanospheres with a
positional correlation extending only over the nearest neighbors (Figure 8). Fourier analysis
shows a weak diffuse halo, which corresponds to hexagonal packing and short-range ordering
extending over 4 - 5 coordination spheres (Figure 8). Larger latex nanoparticles with a
narrow size distribution can form 2D lattices with long-range ordering within monolayers.
Obviously, strong tethering of charged nanoparticles to surfaces prevents the surface
diffusion and rearrangements required for the formation of perfect lateral ordering.
Formation of smooth monolayers composed of melted material is observed after thermal
treatment at high temperature.

To overcome the strong repulsive forces among charged particles, which prevent
them from forming complete monolayers, we tested a force-assisted electrostatic self-assembly
approach. We used additional capillary forces within the meniscus to form ordered
monolayers during slow, controlled pulling out of solution (modified dip-coating) and shear
flow resulting from spin-coating. We observed that force-assisted self-assembly of the
same latexes can produce dense monolayer films with long-range in-plane ordering (Figure
5f). Lateral sizes of very well ordered monolayers are in the range of tens to hundreds of
micrometers, and six-fold symmetry of in-plane ordering is usually revealed by Fourier
analysis of monolayer films (Figure 9).

Figure 8. Fourier transformation of latex monolayer with short-range in-plane ordering
(insert).

Figure 9. Fourier transformation of latex monolayer with long-range in-plane ordering
(insert).
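
The six-fold pattern expected from Fourier analysis of a well-ordered monolayer can be reproduced with a small numerical sketch. It builds an ideal hexagonal lattice of point-like particles and evaluates the structure factor at the six first-order reciprocal-lattice positions. The function names, the lattice size, and the 200 nm spacing are assumptions chosen for illustration; a real analysis would start from particle positions extracted from an AFM image.

```python
import cmath
import math


def hexagonal_lattice(n_side, spacing):
    """Positions of an n_side x n_side patch of an ideal hexagonal lattice."""
    a1 = (spacing, 0.0)
    a2 = (spacing / 2.0, spacing * math.sqrt(3.0) / 2.0)
    return [
        (i * a1[0] + j * a2[0], i * a1[1] + j * a2[1])
        for i in range(n_side)
        for j in range(n_side)
    ]


def structure_factor(points, qx, qy):
    """S(q) = |sum_k exp(i q.r_k)|^2 / N for point-like particles."""
    total = sum(cmath.exp(1j * (qx * x + qy * y)) for x, y in points)
    return abs(total) ** 2 / len(points)


def first_order_peaks(spacing):
    """The six shortest reciprocal-lattice vectors, 60 degrees apart."""
    q = 4.0 * math.pi / (spacing * math.sqrt(3.0))
    return [
        (q * math.cos(math.radians(a)), q * math.sin(math.radians(a)))
        for a in range(30, 360, 60)
    ]


spacing_nm = 200.0  # assumed nearest-neighbor distance
points = hexagonal_lattice(20, spacing_nm)
peaks = [structure_factor(points, qx, qy) for qx, qy in first_order_peaks(spacing_nm)]

# Same |q| but rotated halfway between two peaks: intensity should be weak.
q_mag = 4.0 * math.pi / (spacing_nm * math.sqrt(3.0))
between = structure_factor(points, q_mag, 0.0)
```

For a perfect lattice each of the six peaks reaches the number of particles N, while the intensity between the peaks stays small. Positional disorder smears the six spots into the diffuse halo described for short-range ordered monolayers.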

Multilayer  films  from  different  polymeric  materials 

Multilayer  PSS/PAA  films  were  fabricated  and  extensively  studied  earlier. 
Virtually  linear  growth  of  film  thickness  versus  number  of  deposition  cycles  was  observed 
with  an  average  increment  of  multilayer  growth  close  to  1  nm  (Figure  10).  This  value  is 
close  to  monolayer  thickness  and  similar  to  other  amorphous  polyions  (see  data  for 
PVC/PAA  film  in  Figure  10). 

Layer-by-layer  deposition  of  oppositely  charged  dendrimers  in  combination 
results  in  the  formation  of  films  with  homogeneous  surface  morphology  for  a  limited  number 
of layers. The variation of the multilayer film thickness with the number of deposited layers, d
(x), is close to linear, with an increment per layer in the range of 2.8 ± 0.3 nm for G4/3.5 films
and 3.8 ± 0.6 nm for G10/9.5 films (see data for G4/3.5 in Figure 10). The small increment
of  the  film  thickness  per  molecular  layer  for  multilayer  films  indicates  that  the  dendritic 
macromolecules  are  indeed  very  soft  and  do  not  preserve  their  shape  in  the  condensed  state  at 
interfaces  similarly  to  monolayers.  However,  the  average  thickness  of  a  molecular  layer 
within  the  multilayer  films  is  still  two  -  three  times  higher  for  dendrimers  than  for  amorphous 
polyions  with  comparable  molecular  weight  despite  their  highly  compressed  state  (Figure 
10). 


A typical multilayer growth pattern (thickness versus number of deposited layers) for
CML20/AL20 latexes with 20 nm diameter is shown in Figure 7. The average increment for
the first five layers is about 15 nm, which corresponds to centered cubic packing of spheres.
However, for n > 5 the increment decreases to 7.3 nm per layer due to incomplete filling of
subsequent layers during the film growth. Surface roughness gradually increases for the first
five layers and reaches a constant value of 22 ± 2 nm for thicker films. This suggests that
some equilibrium growth process is established during this stage with roughly two layers
being “under construction” during each deposition cycle.

Figure 10. Film thickness growth for different polymer pairs.
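
The two-regime growth described above can be written as a simple piecewise-linear estimate. The sketch assumes a sharp crossover at exactly five layers and uses the two average increments quoted in the text; the function name and the sharp-crossover simplification are assumptions made for illustration.

```python
def latex_film_thickness(n_layers, early_increment=15.0, late_increment=7.3, crossover=5):
    """Estimated film thickness in nm after n_layers deposited latex layers.

    The first `crossover` layers add `early_increment` nm each; every
    further layer adds `late_increment` nm.
    """
    if n_layers < 0:
        raise ValueError("n_layers must be non-negative")
    early = min(n_layers, crossover)
    late = max(n_layers - crossover, 0)
    return early * early_increment + late * late_increment


for n in (1, 5, 10):
    print(n, "layers:", round(latex_film_thickness(n), 1), "nm")
```

Under these assumptions a five-layer film is about 75 nm thick, and ten layers give roughly 111.5 nm rather than the 150 nm that ideal dense packing of every layer would produce.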

Variation of the pH of the latex solution and activation of the substrate at different pH values allow
control of the fine balance of interparticle and particle-substrate interactions through changes in the
surface charge of ionizable surface groups. We tested various combinations of
repulsive-attraction interaction strengths for PS latexes. We observed that weak repulsive interaction
among  nanoparticles  (e.  g.,  pH  =  7  -  8  for  amine  terminated  particles  with  pK  =  9)  combined 
with  strong  attraction  of  highly  charged  substrate  (a  bare  silicon  activated  by  solution  with 
pH  =  9)  and  slow  pulling  through  the  meniscus  produces  perfect  monolayers  and  bilayers 
expanded  over  a  surface  area  of  a  fraction  of  a  millimeter  across  (Figure  11). 

The Fourier image shows a net of reflections similar to a single-crystal pattern in reciprocal
space, which corresponds to long-range hexagonal packing of nanoparticles with an interplanar
distance (110) of 185 nm (Figure 11). This example represents the first truly
macroscopically ordered monolayer of charged latex nanoparticles obtained by force-assisted
electrostatic deposition at mild ionic conditions.
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Figure  11.  AFM  image  of  latex  nanoparticles  organized  in  a  perfect  monolayer  (a  defected  area 
is  selected  to  underline  smooth  part)  and  corresponding  Fourier-transformation. 
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Multilayer  films  from  organic-inorganic  nanoparticles. 


Inorganic composite nanoparticles were tested as a prospective component for
multilayer organic-inorganic films to enhance the gradient of refractive index. The nanoparticles
tested are gold-core spheres with a silicon oxide shell and an average diameter of 50 nm, obtained
from Melbourne University (Figure 12). The surface of these particles is terminated with silanol
groups (SiOH), which are negatively charged at neutral pH. These nanoparticles can be used for
monolayer fabrication on positively charged surfaces (amine SAMs) or as a counterpart for
positively charged amidine latex nanoparticles 40-50 nm in diameter. Our first attempts
at monolayer fabrication showed promising results. Monolayers with dense packing of gold
nanoparticles can be formed on amine-terminated SAMs (Figure 13). Further studies are
required before these inorganic nanoparticles can be used for fabrication of multilayer films.

Figure 12. Organic-inorganic multilayers from latex/gold nanoparticles.


Figure 13. AFM image of monolayers of composite gold nanoparticles obtained by
electrostatic deposition and histogram of height distribution (histogram axis: D).


Conclusions  and  prospectives 

From a comparison of the growth modes of multilayer films composed of amorphous
polyionic materials and dendrimer/latex nanoparticles, we can conclude that replacing
unstructured coiled macromolecular chains with organized dendritic supramolecules or “bulk”
nanoparticles results in a significant increase in the growth increment. Correspondingly, the
internal periodicity of these supramolecular multilayer self-assembled films is about an order
of magnitude higher than for conventional films. This type of structural organization is
unachievable for amorphous polyelectrolytes with a random spatial distribution of “sticky”
groups. Obviously, the routine proposed may be used for formation of supramolecular films
with mesoscale periodicity and intriguing optical properties.

A general observation can be made that force-assisted electrostatic self-assembly of
charged  nanoparticles  under  dipping  or  spinning  conditions  may  result  in  fabrication  of 
relatively  large  uniform  layers  (a  fraction  of  a  millimeter  across)  with  long-range  positional 
and  orientational  ordering  of  nanoparticles  within  monolayers.  However,  only  a  fine  balance 
of  interparticle  (particle-particle  and  particle-substrate)  interactions  and  external  forces  within 
a  narrow  window  may  be  useful  for  fabrication  of  perfect  films.  We  present  the  first  truly 
macroscopically  ordered  monolayer  of  charged  latex  nanoparticles  obtained  by  force  assisted 
electrostatic  deposition.  Lateral  sizes  of  this  monolayer  are  extended  to  a  fraction  of  a 
millimeter. 

Prospectives  and  trends 

Several apparent problems and possible prospective trends can be deduced from our
introductory study. These problems and trends should be addressed in a more extensive project
with long-term support.

Problems  to  address  are: 

- fabrication of complete, compact, and uniform molecular layers from nanoparticles with
strong repulsive interactions, finding the right balance of interparticle and particle-substrate
interactions and surface mobility

- stable growth of multilayer films with a large number of layers and prevention of in-plane
phase separation at a certain level when 3D bulk microphase structure becomes the most
favorable

- “freezing” the internal gradient of component composition along the surface normal for an
indefinitely long time

Prospective directions and general trends could be defined as follows:

- replacing self-assembly with “force assisted” self-assembly by adding an active external
gradient to overcome long-range repulsive interactions among mesoscale nanoparticles:
capillary forces (dip-coating), shearing flow (spin-coating), and electrostatic interactions
(substrate potential variation and degree of ionization of functional groups)

- replacing pure polymer-polymer systems with organic-inorganic systems to enhance the
gradient of chemical composition and refractive index; appropriate inorganic nanoparticles
“compatible” with existing polymer components should be sought

- testing a new generation of supramolecular assemblies, which should include dendritic
macromolecules with rigid architecture and functional surface groups able to form
mesomorphic phases and zwitterionic/dye-containing latex nanoparticles with functional
surface groups
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DISTRIBUTED  CONTROL  OF  NONLINEAR  FLEXIBLE  BEAMS  AND  PLATES 
WITH  MECHANICAL  AND  TEMPERATURE  EXCITATIONS 


H.  S.  Tzou 
Professor 

Department  of  Mechanical  Engineering 
University  of  Kentucky 
Lexington,  KY  40506-0108 


Abstract 

Beam and plate-type components are widely used in many aerospace structures.
Imposed  shape  changes  and  surface  control  of  flexible  beams  and  plates  could  offer  many 
aerodynamic  advantages  in  flight  maneuverability  and  precision  control.  Deformed  shapes 
and  surfaces  often  involve  nonlinear  deformations.  Studies  of  control  of  nonlinear  behavior 
related  to  the  deformed  surfaces  and  shape  changes  would  provide  detailed  information  in 
future  controlled  surface  design  and  implementation.  This  research  is  concerned  with  the 
control  effectiveness  of  nonlinearly  deformed  beams  and  plates  based  on  the  smart 
structures  technology.  Piezoelectric  materials  are  widely  used  as  sensors  and  actuators  in 
sensing,  actuation,  and  control  of  smart  structures  and  structronic  systems.  Control 
effectiveness  of  piezoelectric  laminated  nonlinear  flexible  beams  and  plates  subjected  to 
mechanical  and  temperature  excitations  is  investigated.  It  is  assumed  that  the  flexible 
beams  and  plates  encounter  the  von  Karman  type  geometrical  nonlinearity. 
Thermoelectromechanical  equations  and  boundary  conditions  including  elastic, 
temperature,  and  piezoelectric  couplings  are  formulated  first,  and  analytical  solutions 
derived  next.  Dynamics,  active  control  of  nonlinear  flexible  deflections,  thermal 
deformations,  and  natural  frequencies  using  distributed  piezoelectric  actuators  are  studied, 
and their nonlinear effects are evaluated.


INTRODUCTION 

Recent  development  of  smart  (or  intelligent)  structures,  structronic 
(structure-electronic)  systems,  and  micro  mechanical  systems  has  demonstrated  the 
versatilities  of  piezoelectric  materials  in  both  sensor,  the  direct  piezoelectric  effect,  and 
actuator,  the  converse  piezoelectric  effect,  applications  (Tzou  and  Anderson,  1992;  Tzou 
and  Fukuda,  1992).  Piezoelectrics  are  usually  bonded  (embedded  or  surface  coupled)  with 
elastic  structures,  serving  as  sensors  and/or  actuators  for  structural  monitoring  and 
control.  These  piezoelectric  sensors  and  actuators  can  be  further  classified  as  "discrete"  or 
"distributed"  devices.  The  distributed  sensors  and  actuators  offer  many  advantages  over 
the  discrete  devices,  such  as  multiple  modal  controls,  spatial  filterings,  spatially  shaped 
modal  controls,  etc  (Tzou,  1993).  Accordingly,  distributed  piezoelectric  sensors  and 
actuators  are  widely  used  in  various  structural  applications. 

In recent years, many sophisticated analyses have been performed and new engineering
applications have been explored. However, most of these studies were conducted based on the
linear  theories,  i.e.,  linear  elasticity  and  piezothermoelectricity.  It  is  known  that 
piezoelectric  materials  are  rather  nonlinear.  In  addition,  large  oscillation  and/or  flexibility 
of  elastic  structures  can  introduce  large  deflections  of  distributed  sensors  and  actuators. 
Thus,  the  nonlinear  characteristics  of  piezoelectric  materials  and  laminated  structures  can 
be  classified  into  two  categories:  1)  the  geometrical  nonlinearity  and  2)  the  material 
nonlinearity.  The  former  usually  involves  large  deformations  and  the  latter  is  associated 
with  nonlinear  material  properties,  e.g.,  hysteresis,  temperature  dependent  material 
constants,  etc  (Tzou  and  Bao,  1996).  Pai,  et  al.  (1993)  recently  studied  a  nonlinear 
composite  plate  laminated  with  piezoelectric  layers.  Yu  (1993)  reviewed  recent  studies  of 
linear  and  nonlinear  theories  of  elastic  and  piezoelectric  plates.  Lalande  et  al.  (1993) 
investigated the nonlinear deformation of a piezoelectric Moonie actuator based on a
simplified nonlinear beam theory. Sreeram et al. (1993) investigated nonlinear hysteresis
modeling of a piezoceramic actuator. Librescu (1987) proposed a refined geometrical
nonlinear theory of anisotropic laminated shells. The linear thermo-electromechanical
behavior of distributed piezoelectric sensors and actuators was also recently studied (Tzou
and Ye, 1994; Tzou and Howard, 1994). A theory on geometrical nonlinearity of
piezothermoelastic  shell  laminates  simultaneously  exposed  to  mechanical,  electric,  and 
thermal  fields  has  been  recently  proposed  (Tzou  and  Bao,  1996).  Tzou  and  Zhou  (1995) 
investigated  static  and  dynamic  control  of  a  circular  plate  with  geometrical  nonlinearity. 
This  research  is  devoted  to  a  study  of  distributed  static  and  dynamic  control  of  nonlinear 
flexible  beams  and  rectangular  plates  with  geometrical  nonlinearity  using  distributed 
piezoelectric  actuators. 

Since  flexible  beams  and  plates  (rectangular  and  square  plates)  are  very  common  in 
aerospace  structures,  static  and  dynamic  behaviors  of  piezothermoelastic  laminated  flexible 
beams  and  plates  with  initial  large  nonlinear  deformations  (the  von  Karman  type 
geometrically  nonlinear  deformations)  and  subjected  to  mechanical,  electric,  and 
temperature  excitations  are  investigated  in  this  research.  Active  control  effects  on 
nonlinear  static  deflections  and  natural  frequencies,  including  temperature  variations, 
imposed  by  the  piezoelectric  actuators  are  investigated.  Nonlinear  beam  and  plate 
equations  are  derived  first,  followed  by  nonlinear  static  analysis  and  free  vibration  analysis 
including the effect of initial nonlinear deformations. Active control of nonlinear effects by
the piezoelectric actuators is emphasized. Numerical examples are provided and
simulation  results  discussed. 

CONSTITUTIVE  EQUATIONS 

This  study  focuses  on  the  distributed  control  effectiveness  of  nonlinear  deformations 
and  dynamic  frequencies,  including  mechanical,  electric,  and  temperature  effects.  Since  the 
fundamental  governing  equations  are  derived  from  the  triclinic  piezoelectric  materials  and 
thin  anisotropic  piezothermoelastic  shells,  original  generic  piezothermoelastic  governing 
equations are briefly reviewed. The relations between the electric fields $E_1$, $E_2$, $E_3$ and the
electric potential $\phi$ in the curvilinear coordinate system are (Tzou and Zhong, 1993)

The constitutive relations of the piezothermoelastic shell are governed by three
equations: 1) a stress equation {T}, 2) an electric displacement equation {D}, and 3) a
thermal entropy equation $\Im$ (Tzou and Ye, 1994).

$$\{T\} = [c]\{S\} - [e]^t\{E\} - \{\lambda\}\theta , \tag{2}$$

$$\{D\} = [e]\{S\} + [\epsilon]\{E\} + \{p\}\theta , \tag{3}$$

$$\Im = \{\lambda\}^t\{S\} + \{p\}^t\{E\} + \alpha_v \theta , \tag{4}$$


where {T}, {S}, {E} and {D} denote the stress, strain, electric field and electric
displacement vectors, respectively; $\Im$ is the thermal entropy density; $\theta$ is the temperature
rise ($\theta = \Theta - \Theta_0$, where $\Theta$ is the absolute temperature and $\Theta_0$ the temperature of the natural
state in which stresses and strains are zero); [c], [e], and [$\epsilon$] denote the elastic stiffness
coefficient, piezoelectric coefficient, and dielectric permittivity matrices, respectively; {$\lambda$}
is the stress-temperature coefficient vector; {p} is the pyroelectric coefficient vector; and
$\alpha_v$ is a material constant ($\alpha_v = \rho c_v/\Theta_0$, where $c_v$ is the specific heat at constant volume).
$[\cdot]^t$ and $\{\cdot\}^t$ denote the matrix and vector transpose, respectively. For a generic anisotropic
piezothermoelastic material, the stress and electric displacement equations become (Tzou
and Bao, 1994)

Note that if the effective axes of the piezothermoelastic material do not coincide with the
geometrical axes, an orientation or transformation matrix, in directional cosines and sines,
needs to be defined (Tzou and Bao, 1995). The anisotropic piezothermoelastic material is
used in deriving the generic piezothermoelastic shell system equations; simplifications to
other simpler piezothermoelastic materials, e.g., polyvinylidene fluoride, piezoceramics,
etc., are then explored.
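
To make the coupling in the three constitutive relations concrete, the sketch below evaluates a one-dimensional scalar reduction of the stress, electric displacement, and entropy equations. The scalar reduction, the function name, and every material constant are assumptions introduced for illustration (placeholder orders of magnitude in SI units, not data from this study); the full relations are matrix equations.

```python
def constitutive_1d(strain, e_field, theta, c, e, eps, lam, p, alpha_v):
    """Scalar form of the stress, electric displacement and entropy relations.

    strain  : mechanical strain S
    e_field : electric field E
    theta   : temperature rise above the natural state
    """
    stress = c * strain - e * e_field - lam * theta
    displacement = e * strain + eps * e_field + p * theta
    entropy = lam * strain + p * e_field + alpha_v * theta
    return stress, displacement, entropy


# Placeholder constants (orders of magnitude only, not measured data).
material = dict(c=6.0e10, e=10.0, eps=1.0e-8, lam=1.0e5, p=1.0e-4, alpha_v=1.0e3)

free = constitutive_1d(1.0e-4, 0.0, 0.0, **material)       # strain only
actuated = constitutive_1d(1.0e-4, 1.0e5, 0.0, **material)  # strain plus field
heated = constitutive_1d(1.0e-4, 0.0, 10.0, **material)     # strain plus heating
```

With the sign convention of the stress equation, both an applied electric field and a temperature rise reduce the stress at fixed strain, which is the mechanism the electric control forces and moments rely on.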

VON-KARMAN  NONLINEARITY 


It  is  assumed  that  the  nonlinear  characteristics  are  introduced  by  large  deformations 
which  can  be  introduced  mechanically,  electrically,  and/or  thermally.  A  generic  nonlinear 
deflection $U_i$ in the i-th direction can be expressed as a summation of a membrane
displacement $u_i(\alpha_1,\alpha_2,t)$ and a higher order nonlinear shear deformation effect represented
by the summation of angular rotations $\beta_{ij}(\alpha_1,\alpha_2,t)$ (Tzou and Bao, 1996):

$$U_i(\alpha_1,\alpha_2,\alpha_3,t) = u_i(\alpha_1,\alpha_2,t) + \sum_{j=1}^{m} \alpha_3^{\,j}\,\beta_{ij}(\alpha_1,\alpha_2,t) , \quad i = 1,2,3 , \tag{7}$$

where $u_1$, $u_2$ and $u_3$ are the mid-plane displacement components of the reference surface
along the $\alpha_1$, $\alpha_2$, and $\alpha_3$ axes, respectively; $\beta_{11}$ and $\beta_{21}$ represent the rotational angles in the
positive sense of the $\alpha_1$ and $\alpha_2$ axes, respectively; and $\beta_{3j} = 0$. This expression includes
higher order nonlinear shear deformation effects. However, according to the
Love-Kirchhoff  thin  shell  assumptions  and  a  linear  displacement  approximation  (first  order 
shear  deformation  theory),  only  the  first  term  is  kept  in  the  equation,  i.e.,  m  =  1  (Tzou, 
1993).  The  displacements  and  rotational  angles  are  independent  variables  in  thick  shells. 
However,  the  rotational  angles  are  dependent  variables  in  thin  shells,  and  they  can  be 
derived  from  the  thin  shell  assumptions  in  which  the  transverse  normal  strain  S3  is 
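negligible and shear strains S4 and S5 are zeros. Based on the thin shell assumptions, the
rotational angles $\beta_1 = \beta_{11}$ and $\beta_2 = \beta_{21}$ are derived from the transverse shear strain
equations, i.e., S4 = 0 and S5 = 0.

$$\beta_1 = \frac{u_1}{R_1} - \frac{1}{A_1}\frac{\partial u_3}{\partial \alpha_1} , \quad \text{and} \quad \beta_2 = \frac{u_2}{R_2} - \frac{1}{A_2}\frac{\partial u_3}{\partial \alpha_2} . \tag{8a,b}$$
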


In general, $\alpha_3/R_1 \ll 1$ and $\alpha_3/R_2 \ll 1$; thus, the ratios of the finite distance to the radii of curvature are negligible, i.e., $f_{11} \approx A_1$ and $f_{22} \approx A_2$. It is assumed that the piezothermoelastic shell experiences large deformations in three axial directions. However, in general, the in-plane deflections are still much smaller than the transverse deflections. Thus, the nonlinear effects due to the in-plane large deflections are usually neglected, i.e., the von Karman-type assumptions (Palazotto and Dennis, 1992; Chia, 1980). The nonlinear strain-displacement relations of a thin shell with a large transverse deflection $u_3$ include a linear effect, denoted by a superscript $l$, and a nonlinear effect, denoted by a superscript $n$, induced by the large deformation:

where the subscripts 1 and 2 respectively denote the two normal strains and 6 denotes the in-plane shear strain. Detailed membrane and bending strains are functions of the displacements $u_i$.


1) Membrane strains:

where $s^\circ_1$, $s^\circ_2$, and $s^\circ_6$ are the membrane strains and $\kappa_1$, $\kappa_2$, and $\kappa_6$ are the bending strains (the change of curvatures on the reference surface). Note that the quadratic terms (nonlinear terms) inside the brackets are contributed by the large deflection. Membrane force resultants $N_{ij}$ and bending moments $M_{ij}$ of the piezothermoelastic shell laminate can be derived based on the induced strains:

It is observed that there are three components, i.e., mechanical, electric, and temperature, in the force/moment expressions. Superscripts $e$ and $\theta$ respectively denote the electric and temperature components. The membrane strains and bending strains are coupled by the coupling stiffness coefficients $B_{ij}$ in the elastic force/moment resultants. $N_{ij}^e$ and $N_{ij}^\theta$ are the electric and temperature induced forces; $M_{ij}^e$ and $M_{ij}^\theta$ are the electric and temperature induced moments, respectively. In actuator applications, these electric forces and moments are used to control the shell’s static and dynamic characteristics.

NONLINEAR  PIEZOELECTRIC  SHELL  COMPOSITES 


Mathematical models of the flexible beams and plates (rectangular and square plates) are derived from a generic theory of nonlinear thin anisotropic piezothermoelastic shells. A generic anisotropic deep piezothermoelastic shell is defined in a curvilinear coordinate system, and it is exposed to mechanical, electric, and thermal excitations. Figure 1 illustrates the original generic shell and its derivative geometries, including plates, beams, and other shell and non-shell geometries. It is assumed that the shell is subjected to a large deformation resulting in a geometrical nonlinearity. However, material properties are assumed constant, and the stress and strain relations are linear. The generic theory is derived based on a generic anisotropic piezothermoelastic thin shell. Simplification of the generic theory to other piezoelectric materials, e.g., mm2 (polyvinylidene fluoride), 6mm (piezoceramics), etc., or piezoelectric continua, e.g., spherical shells, cylindrical shells, plates, beams, etc., can be achieved when appropriate material or geometrical parameters are defined (Tzou, 1993). A generic infinitesimal distance ds in a shell can be defined by a fundamental form:

$$(ds)^2 = \sum_{i=1}^{3} (f_{ii})^2 (d\alpha_i)^2, \tag{17}$$

where $f_{ii}(\alpha_1,\alpha_2,\alpha_3) = A_i(1 + \alpha_3/R_i)$ (i = 1,2), $f_{33}(\alpha_1,\alpha_2,\alpha_3) = 1$; $\alpha_3$ is a finite distance measured from the reference surface; $A_1$ and $A_2$ are the Lame parameters; and $R_1$ and $R_2$ are the radii of curvature of the $\alpha_1$ and $\alpha_2$ axes on the surface defined by $\alpha_3 = 0$. For beams and rectangular plates, $A_1 = A_2 = 1$ and $R_1 = R_2 = \infty$. Thus, $(ds)^2 = \sum_{i=1}^{3} (1)^2 (d\alpha_i)^2$. For convenience, the neutral surface is taken as the reference surface, which is defined by the $\alpha_1$ and $\alpha_2$ axes.


Fig.1 A nonlinear piezothermoelastic shell and its derivative geometries.


Hamilton’s principle is used in deriving the system equations and boundary conditions of the piezothermoelastic shell continuum. Simplifying the shell equations gives the nonlinear equations of flexible beams and plates. Hamilton’s principle assumes that the energy variations over an arbitrary period of time are zero. Substituting all energies into Hamilton’s equation and carrying out all derivations, one can derive the thermo-electromechanical equations and boundary conditions of the nonlinear piezothermoelastic shell (Tzou and Bao, 1996).

where $\rho = \frac{1}{h}\sum_k \rho_k h_k$ is defined as a weighted average density for the multi-layered shell.

It is observed that the nonlinear influence on the transverse equation ($u_3$) is very prominent. (All terms inside the brace are contributed by the nonlinear effects of the von Karman type geometric nonlinearity.) Note that the thermo-electromechanical equations look like standard shell equations. However, the force and moment expressions defined by mechanical, thermal, and electric effects are much more complicated than the conventional elastic expressions. Substituting the expressions of $N_{11}$, $N_{22}$, $N_{12}$, $M_{11}$, $M_{22}$, and $M_{12}$ into the above equations leads to the thermo-electromechanical equations defined in the reference displacements $u_1$, $u_2$, and $u_3$. The transverse shear deformation and rotatory inertia effects are not considered. The electric terms, forces and moments, can be used in controlling the mechanical and/or temperature induced excitations (Tzou and Ye, 1994). For nonlinear flexible beams and plates (rectangular and square plates), $A_1 = A_2 = 1$ and $R_1 = R_2 = \infty$. Substituting the Lame parameters and radii into the shell thermo-electromechanical equations and simplifying, one can derive the governing equations for nonlinear piezoelectric laminated beams and plates. Accordingly, thermo-electromechanical couplings and control of static/dynamic nonlinearities can be investigated. Detailed derivations are presented next.

NONLINEAR  PIEZOELECTRIC  COMPOSITE  PLATES 

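For piezoelectric laminated composite rectangular plates, Figure 2, the coordinate system, radii of curvature, and Lame parameters are defined as follows: $\alpha_1 = x$, $\alpha_2 = y$, $\alpha_3 = z$, $R_1 = \infty$, $R_2 = \infty$, $A_1 = 1$, and $A_2 = 1$.


It is assumed that an elastic plate, with dimensions 2a×2b×h, is sandwiched between two piezoelectric layers and the composite plate experiences a large deformation. The resultant membrane forces and the bending moments are

$$\begin{bmatrix} N_{xx}\\ N_{yy}\\ N_{xy}\end{bmatrix} = \begin{bmatrix} N^m_{xx}\\ N^m_{yy}\\ N^m_{xy}\end{bmatrix} - \begin{bmatrix} N^e_{xx}\\ N^e_{yy}\\ N^e_{xy}\end{bmatrix} - \begin{bmatrix} N^\theta_{xx}\\ N^\theta_{yy}\\ N^\theta_{xy}\end{bmatrix}, \qquad \begin{bmatrix} M_{xx}\\ M_{yy}\\ M_{xy}\end{bmatrix} = \begin{bmatrix} M^m_{xx}\\ M^m_{yy}\\ M^m_{xy}\end{bmatrix} - \begin{bmatrix} M^e_{xx}\\ M^e_{yy}\\ M^e_{xy}\end{bmatrix} - \begin{bmatrix} M^\theta_{xx}\\ M^\theta_{yy}\\ M^\theta_{xy}\end{bmatrix}, \tag{23}$$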


where the superscript "m" denotes the mechanically induced components, "e" the electrically induced components, and "θ" the thermally induced components. The mechanical membrane forces and moments are

$$\begin{bmatrix} N^m_{xx}\\ N^m_{yy}\\ N^m_{xy}\end{bmatrix} = K \begin{bmatrix} s^\circ_x + \mu s^\circ_y\\ s^\circ_y + \mu s^\circ_x\\ s^\circ_{xy}(1-\mu)/2 \end{bmatrix}, \tag{24}$$

where $K = Yh/(1-\mu^2)$ is the membrane stiffness; h = 2d is the thickness of the elastic plate; $\mu$ is Poisson’s ratio; and Y is Young’s modulus;


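$$\begin{bmatrix} M^m_{xx}\\ M^m_{yy}\\ M^m_{xy}\end{bmatrix} = D \begin{bmatrix} \kappa_x + \mu\kappa_y\\ \kappa_y + \mu\kappa_x\\ \kappa_{xy}(1-\mu)/2 \end{bmatrix},$$

where $D = Yh^3/[12(1-\mu^2)]$ is the bending stiffness.

The electric force and moment resultants $N^e_{ij}$ and $M^e_{ij}$ are expressed in terms of the piezoelectric constant $e_{31}$ and the layer voltages $\phi_3^1$ and $\phi_3^3$, where $\phi_3^j$ is the total voltage across the j-th layer; only the normal (xx and yy) components are nonzero. The thermal force resultants $N^\theta_{ij}$ are through-thickness integrals of $[\lambda_x\ \ \lambda_y\ \ 0]^T\,\theta\,d\alpha_3$ and the thermal moment resultants $M^\theta_{ij}$ are through-thickness integrals of $[\lambda_x\ \ \lambda_y\ \ 0]^T\,\theta\,\alpha_3\,d\alpha_3$, taken over the elastic plate ($-d \le \alpha_3 \le +d$) and over the piezoelectric layers, where t is the thickness of the piezoelectric layers. [The full matrix expressions for the electric and thermal resultants are illegible in the source scan.]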
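The two plate stiffnesses defined above are simple to evaluate. The following is a minimal sketch, assuming an isotropic elastic core and using illustrative steel-like values (Y = 210 GPa, μ = 0.3, h = 1 mm) that are assumptions of this example, not values taken from the report's tables; the function name is hypothetical.

```python
def plate_stiffness(young_modulus, thickness, poisson_ratio):
    """Return (K, D) for an isotropic elastic plate.

    K = Y h / (1 - mu^2)            membrane stiffness, N/m
    D = Y h^3 / [12 (1 - mu^2)]     bending stiffness, N*m
    """
    factor = 1.0 - poisson_ratio ** 2
    membrane = young_modulus * thickness / factor
    bending = young_modulus * thickness ** 3 / (12.0 * factor)
    return membrane, bending


# Assumed, illustrative inputs (not from the report).
K, D = plate_stiffness(young_modulus=210e9, thickness=1.0e-3, poisson_ratio=0.3)
print(f"K = {K:.4e} N/m, D = {D:.4f} N*m")
```

Because D scales with h cubed while K scales with h, halving the thickness in this sketch reduces the bending stiffness by a factor of eight but the membrane stiffness only by a factor of two.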


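The strain-displacement relations are

$$\begin{bmatrix} s_x\\ s_y\\ s_{xy}\end{bmatrix} = \begin{bmatrix} s^\circ_x\\ s^\circ_y\\ s^\circ_{xy}\end{bmatrix} + z \begin{bmatrix} \kappa_x\\ \kappa_y\\ \kappa_{xy}\end{bmatrix}, \tag{25}$$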


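where the membrane and bending strains are

$$\begin{aligned}
s^\circ_x &= \frac{\partial u_x}{\partial x} + \frac{1}{2}\left(\frac{\partial u_z}{\partial x}\right)^2, & \kappa_x &= \frac{\partial \beta_x}{\partial x} = -\frac{\partial^2 u_z}{\partial x^2};\\
s^\circ_y &= \frac{\partial u_y}{\partial y} + \frac{1}{2}\left(\frac{\partial u_z}{\partial y}\right)^2, & \kappa_y &= \frac{\partial \beta_y}{\partial y} = -\frac{\partial^2 u_z}{\partial y^2};\\
s^\circ_{xy} &= \frac{\partial u_x}{\partial y} + \frac{\partial u_y}{\partial x} + \frac{\partial u_z}{\partial x}\frac{\partial u_z}{\partial y}, & \kappa_{xy} &= \frac{\partial \beta_x}{\partial y} + \frac{\partial \beta_y}{\partial x} = -2\frac{\partial^2 u_z}{\partial x\,\partial y}.
\end{aligned} \tag{26}$$

Note that the quadratic terms in the membrane strains are the von Karman-type nonlinear terms. The in-plane inertia effects of thin plates are usually neglected and the in-plane mechanical forces are assumed zero, i.e., $q_x = q_y = 0$. Then, the in-plane nonlinear plate equations are derived as

$$\frac{\partial N_{xx}}{\partial x} + \frac{\partial N_{xy}}{\partial y} = 0, \qquad \frac{\partial N_{xy}}{\partial x} + \frac{\partial N_{yy}}{\partial y} = 0.$$

The static versions of these equations are those of the von Karman plate theory. These simplified static equations are identical to those in (Chia, 1980).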
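To give a feel for the size of the von Karman quadratic terms in the membrane strains, the following sketch evaluates the three membrane strains from given displacement gradients. The function name, argument names, and the slope values are hypothetical and chosen only for illustration.

```python
def von_karman_membrane_strains(ux_x, uy_y, ux_y, uy_x, uz_x, uz_y):
    """Membrane strains with von Karman-type quadratic terms.

    Arguments are displacement gradients, e.g. ux_x = d(u_x)/dx and
    uz_y = d(u_z)/dy. Returns (s_x, s_y, s_xy).
    """
    s_x = ux_x + 0.5 * uz_x ** 2          # linear part + quadratic part
    s_y = uy_y + 0.5 * uz_y ** 2
    s_xy = ux_y + uy_x + uz_x * uz_y      # shear strain with product term
    return s_x, s_y, s_xy


# Assumed example: no in-plane displacement gradients, transverse slopes of
# 0.02 and 0.01 rad, so every strain below is purely the nonlinear part.
s_x, s_y, s_xy = von_karman_membrane_strains(0.0, 0.0, 0.0, 0.0, 0.02, 0.01)
print(s_x, s_y, s_xy)
```

In this example all of the membrane strain comes from the transverse slopes, which is the stretching effect that the linear theory omits.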


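Substituting all force and moment expressions (mechanical, thermal, and electric components) into the plate equations yields three piezothermoelastic equations in terms of the displacements $u_x$, $u_y$, and $u_z$.

Furthermore, substituting all force and moment expressions, one can derive the displacement equations as

$$\frac{\partial^2 u_x}{\partial x^2} + \frac{1-\mu}{2}\frac{\partial^2 u_x}{\partial y^2} + \frac{1+\mu}{2}\frac{\partial^2 u_y}{\partial x\,\partial y} = -\frac{\partial u_z}{\partial x}\left(\frac{\partial^2 u_z}{\partial x^2} + \frac{1-\mu}{2}\frac{\partial^2 u_z}{\partial y^2}\right) - \frac{1+\mu}{2}\frac{\partial u_z}{\partial y}\frac{\partial^2 u_z}{\partial x\,\partial y} + \frac{1}{K}\left[\left(\frac{\partial N^e_{xx}}{\partial x} + \frac{\partial N^e_{xy}}{\partial y}\right) + \left(\frac{\partial N^\theta_{xx}}{\partial x} + \frac{\partial N^\theta_{xy}}{\partial y}\right)\right]; \tag{32}$$

$$\frac{\partial^2 u_y}{\partial y^2} + \frac{1-\mu}{2}\frac{\partial^2 u_y}{\partial x^2} + \frac{1+\mu}{2}\frac{\partial^2 u_x}{\partial x\,\partial y} = -\frac{\partial u_z}{\partial y}\left(\frac{\partial^2 u_z}{\partial y^2} + \frac{1-\mu}{2}\frac{\partial^2 u_z}{\partial x^2}\right) - \frac{1+\mu}{2}\frac{\partial u_z}{\partial x}\frac{\partial^2 u_z}{\partial x\,\partial y} + \frac{1}{K}\left[\left(\frac{\partial N^e_{xy}}{\partial x} + \frac{\partial N^e_{yy}}{\partial y}\right) + \left(\frac{\partial N^\theta_{xy}}{\partial x} + \frac{\partial N^\theta_{yy}}{\partial y}\right)\right]; \tag{33}$$

$$\begin{aligned}
D\nabla^2\nabla^2 u_z + \rho h\,\ddot u_z = q_z &+ K\left[(s^\circ_x + \mu s^\circ_y)\frac{\partial^2 u_z}{\partial x^2} + (1-\mu)\,s^\circ_{xy}\frac{\partial^2 u_z}{\partial x\,\partial y} + (s^\circ_y + \mu s^\circ_x)\frac{\partial^2 u_z}{\partial y^2}\right]\\
&- \left[(N^e_{xx} + N^\theta_{xx})\frac{\partial^2 u_z}{\partial x^2} + 2(N^e_{xy} + N^\theta_{xy})\frac{\partial^2 u_z}{\partial x\,\partial y} + (N^e_{yy} + N^\theta_{yy})\frac{\partial^2 u_z}{\partial y^2}\right]\\
&- \left[\left(\frac{\partial^2 M^e_{xx}}{\partial x^2} + \frac{\partial^2 M^\theta_{xx}}{\partial x^2}\right) + 2\left(\frac{\partial^2 M^e_{xy}}{\partial x\,\partial y} + \frac{\partial^2 M^\theta_{xy}}{\partial x\,\partial y}\right) + \left(\frac{\partial^2 M^e_{yy}}{\partial y^2} + \frac{\partial^2 M^\theta_{yy}}{\partial y^2}\right)\right],
\end{aligned} \tag{34}$$

where the membrane strains $s^\circ_x$, $s^\circ_y$, and $s^\circ_{xy}$ are the displacement expressions given above, $\nabla^2 = \partial^2/\partial x^2 + \partial^2/\partial y^2$ is the Laplacian operator, and $\ddot u_z = \partial^2 u_z/\partial t^2$.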


For a uniformly distributed piezoelectric layer with constant thickness, the electric potentials on the piezo-layers are independent of the coordinates x and y. It is also assumed that the temperature rise is uniform with respect to the x and y coordinates, $\theta(x,y,z) = \theta(z)$. Occasionally, introducing a generic forcing function and representing the existing force terms by it in certain boundary-value problems significantly simplifies the system equations and the solution procedures. Thus, a generic forcing function F(x,y) is introduced and the forces are defined by

$$N^m_{xx} = \frac{\partial^2 F}{\partial y^2}, \quad N^m_{yy} = \frac{\partial^2 F}{\partial x^2}, \quad\text{and}\quad N^m_{xy} = -\frac{\partial^2 F}{\partial x\,\partial y}. \tag{35}$$


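Note that the two in-plane equations are exactly satisfied and the transverse equation is expressed in terms of the transverse deflection $u_z$ and the forcing function F:

$$D\nabla^2\nabla^2 u_z + \rho h\,\ddot u_z = q_z + N(F,u_z) - \left[(N^e_{xx} + N^\theta_{xx})\frac{\partial^2 u_z}{\partial x^2} + 2(N^e_{xy} + N^\theta_{xy})\frac{\partial^2 u_z}{\partial x\,\partial y} + (N^e_{yy} + N^\theta_{yy})\frac{\partial^2 u_z}{\partial y^2}\right],$$

where $N(F,u_z)$ is the nonlinear operator defined by

$$N(F,u_z) = \frac{\partial^2 F}{\partial y^2}\frac{\partial^2 u_z}{\partial x^2} + \frac{\partial^2 F}{\partial x^2}\frac{\partial^2 u_z}{\partial y^2} - 2\frac{\partial^2 F}{\partial x\,\partial y}\frac{\partial^2 u_z}{\partial x\,\partial y}. \tag{37}$$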
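The statement that the in-plane equations are satisfied exactly can be checked directly from the forcing-function definitions of Eq.(35). The short verification below assumes, as stated earlier, that the electric and thermal resultants do not vary with x and y, so that only the mechanical forces contribute to the in-plane derivatives:

```latex
\begin{aligned}
\frac{\partial N^m_{xx}}{\partial x} + \frac{\partial N^m_{xy}}{\partial y}
  &= \frac{\partial^3 F}{\partial x\,\partial y^2} - \frac{\partial^3 F}{\partial x\,\partial y^2} = 0,\\
\frac{\partial N^m_{xy}}{\partial x} + \frac{\partial N^m_{yy}}{\partial y}
  &= -\frac{\partial^3 F}{\partial x^2\,\partial y} + \frac{\partial^3 F}{\partial x^2\,\partial y} = 0.
\end{aligned}
```

Any sufficiently smooth F therefore satisfies both in-plane equations, which is why only the transverse equation and the compatibility condition remain to be solved.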

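The second equation is the compatibility equation of the plate. Applying the membrane strain equations leads to the compatibility condition:

$$\frac{\partial^2 s^\circ_x}{\partial y^2} + \frac{\partial^2 s^\circ_y}{\partial x^2} - \frac{\partial^2 s^\circ_{xy}}{\partial x\,\partial y} = \left(\frac{\partial^2 u_z}{\partial x\,\partial y}\right)^2 - \frac{\partial^2 u_z}{\partial x^2}\frac{\partial^2 u_z}{\partial y^2}.$$

Since the forces are functions of the membrane strains,

$$\begin{bmatrix} N^m_{xx}\\ N^m_{yy}\\ N^m_{xy}\end{bmatrix} = K \begin{bmatrix} 1 & \mu & 0\\ \mu & 1 & 0\\ 0 & 0 & (1-\mu)/2 \end{bmatrix} \begin{bmatrix} s^\circ_x\\ s^\circ_y\\ s^\circ_{xy}\end{bmatrix},$$

the strains can be expressed as functions of the forces:

$$\begin{bmatrix} s^\circ_x\\ s^\circ_y\\ s^\circ_{xy}\end{bmatrix} = \frac{1}{K} \begin{bmatrix} 1 & \mu & 0\\ \mu & 1 & 0\\ 0 & 0 & (1-\mu)/2 \end{bmatrix}^{-1} \begin{bmatrix} N^m_{xx}\\ N^m_{yy}\\ N^m_{xy}\end{bmatrix} = \frac{1}{Yh} \begin{bmatrix} 1 & -\mu & 0\\ -\mu & 1 & 0\\ 0 & 0 & 2(1+\mu) \end{bmatrix} \begin{bmatrix} N^m_{xx}\\ N^m_{yy}\\ N^m_{xy}\end{bmatrix}.$$

Referring to the forces represented by the generic forcing function, i.e., $N^m_{xx} = \partial^2 F/\partial y^2$, etc., one can also define the strains as functions of the forcing function:

$$\begin{aligned}
s^\circ_x &= \frac{1}{Yh}\left(N^m_{xx} - \mu N^m_{yy}\right) = \frac{1}{Yh}\left(\frac{\partial^2 F}{\partial y^2} - \mu\frac{\partial^2 F}{\partial x^2}\right);\\
s^\circ_y &= \frac{1}{Yh}\left(N^m_{yy} - \mu N^m_{xx}\right) = \frac{1}{Yh}\left(\frac{\partial^2 F}{\partial x^2} - \mu\frac{\partial^2 F}{\partial y^2}\right);\\
s^\circ_{xy} &= \frac{2(1+\mu)}{Yh}N^m_{xy} = -\frac{2(1+\mu)}{Yh}\frac{\partial^2 F}{\partial x\,\partial y}.
\end{aligned} \tag{41}$$

The second equation can then be obtained by substituting these equations into the condition of compatibility:

$$\nabla^2\nabla^2 F(x,y) + \frac{Yh}{2}N(u_z,u_z) = 0. \tag{43}$$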
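The substitution step leading to Eq.(43) can be written out as follows. This is a sketch of the intermediate algebra, assuming the strain-force relations and the nonlinear operator N(·,·) as defined above, with the compatibility condition taken in its standard von Karman form:

```latex
\begin{aligned}
\frac{\partial^2 s^\circ_x}{\partial y^2} + \frac{\partial^2 s^\circ_y}{\partial x^2}
  - \frac{\partial^2 s^\circ_{xy}}{\partial x\,\partial y}
 &= \frac{1}{Yh}\left[\frac{\partial^4 F}{\partial y^4}
    - \mu\frac{\partial^4 F}{\partial x^2\partial y^2}
    + \frac{\partial^4 F}{\partial x^4}
    - \mu\frac{\partial^4 F}{\partial x^2\partial y^2}
    + 2(1+\mu)\frac{\partial^4 F}{\partial x^2\partial y^2}\right]
  = \frac{1}{Yh}\nabla^2\nabla^2 F,\\[4pt]
\left(\frac{\partial^2 u_z}{\partial x\,\partial y}\right)^2
  - \frac{\partial^2 u_z}{\partial x^2}\frac{\partial^2 u_z}{\partial y^2}
 &= -\frac{1}{2}\left[2\frac{\partial^2 u_z}{\partial x^2}\frac{\partial^2 u_z}{\partial y^2}
    - 2\left(\frac{\partial^2 u_z}{\partial x\,\partial y}\right)^2\right]
  = -\frac{1}{2}N(u_z,u_z).
\end{aligned}
```

Equating the two right-hand sides and multiplying by Yh gives the compatibility equation in terms of F and the transverse deflection; the Poisson-ratio terms cancel, so μ does not appear in the result.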


Since closed-form solutions to these nonlinear equations are usually not available, approximation methods are used to solve them. The approximation techniques include the (generalized) double Fourier method, the Rayleigh-Ritz method, Galerkin’s method, the perturbation technique, the finite difference technique, etc. Simulation results based on the series solutions are presented later.

Four boundary conditions are usually required for each edge of the plate. Electric boundary conditions are also allowed, from which boundary control can be implemented. Generic boundary conditions are


$$\begin{aligned}
u_n &= u_n^* &&\text{or}& N_{nn} &= N_{nn}^*;\\
u_s &= u_s^* &&\text{or}& N_{ns} &= N_{ns}^*;\\
u_z &= u_z^* &&\text{or}& Q_n + \frac{\partial M_{ns}}{\partial s} &= Q_n^* + \frac{\partial M_{ns}^*}{\partial s};\\
\frac{\partial u_z}{\partial n} &= \frac{\partial u_z^*}{\partial n} &&\text{or}& M_{nn} &= M_{nn}^*;
\end{aligned} \tag{44}$$

in which the subscripts n and s denote the directions outward normal and tangential to the boundary, respectively, and the starred quantities designate the prescribed values. Typical mechanical boundary conditions for the plate are: (45a-e)

a) Rigidly clamped edge: $u_z = \partial u_z/\partial n = u_n = u_s = 0$;

b) Loosely clamped edge: $u_z = \partial u_z/\partial n = N_{nn} = N_{ns} = 0$;

c) Simply supported edge (movable in the plane of the plate): $u_z = M_{nn} = N_{nn} = N_{ns} = 0$;

d) Hinged edge (simply supported edge immovable in the plane of the plate): $u_z = M_{nn} = u_n = u_s = 0$;

e) Free edge: $N_{nn} = N_{ns} = M_{nn} = Q_n + \partial M_{ns}/\partial s = 0$.

NONLINEAR PIEZOELECTRIC COMPOSITE BEAMS


A piezothermoelastic laminated beam is shown in Figure 3, in which two piezoelectric layers, dimensions $L \times b \times h_p$, are perfectly bonded on the top and the bottom surfaces of a steel beam, dimensions $L \times b \times h_e$. Thus, the total laminated beam thickness is $h = h_e + 2h_p$. It is assumed that the laminated beam undergoes a von Karman type geometrical nonlinearity and is subjected to temperature and electric inputs.


Fig.3  A  piezothermoelastic  laminated  beam. 


Simplifying the governing equations of the nonlinear piezothermoelastic shell laminate (Tzou and Bao, 1995), one obtains the nonlinear piezothermoelastic beam equations in the longitudinal (x) direction and the transverse (z) direction, respectively.

$$\frac{\partial N_{xx}}{\partial x} + q_x = \rho h\,\ddot u_x, \tag{46}$$

$$\frac{\partial^2 M_{xx}}{\partial x^2} + \frac{\partial N_{xx}}{\partial x}\frac{\partial u_z}{\partial x} + N_{xx}\frac{\partial^2 u_z}{\partial x^2} + q_z = \rho h\,\ddot u_z, \tag{47}$$

where the mass per unit area $\rho h = \sum_k \int \rho_k\,dz = 2\rho_p h_p + \rho_e h_e$; $\rho_p$ and $\rho_e$ are the densities of the piezoelectric layer and the elastic steel layer, respectively. $N_{xx}$ and $M_{xx}$ are the membrane force and bending moment per unit width. Since the beam width b is constant, one can define $N_x = bN_{xx}$ and $M_x = bM_{xx}$. Note that the forces and moments include all elastic, electric, and temperature effects.

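$$N_x = \left(YA + 2Y_pA_p + 2A_p\frac{e_{31}^2}{\epsilon_{33}}\right)s^\circ_{xx} + e_{31}b\,(\phi_3^3 + \phi_3^1) + \left[\left(\frac{p_3 e_{31}}{\epsilon_{33}} - \lambda_p\right)\left(\int_{-(h_e/2+h_p)}^{-h_e/2}\theta\,b\,dz + \int_{h_e/2}^{h_e/2+h_p}\theta\,b\,dz\right) - \lambda\int_{-h_e/2}^{h_e/2}\theta\,b\,dz\right], \tag{48a}$$

$$M_x = \left(YI + 2Y_pI_p + 2I_p\frac{e_{31}^2}{\epsilon_{33}}\right)\kappa_{xx} + e_{31}b\,r_a(\phi_3^3 - \phi_3^1) + \left[\left(\frac{p_3 e_{31}}{\epsilon_{33}} - \lambda_p\right)\left(\int_{-(h_e/2+h_p)}^{-h_e/2}\theta\,z\,b\,dz + \int_{h_e/2}^{h_e/2+h_p}\theta\,z\,b\,dz\right) - \lambda\int_{-h_e/2}^{h_e/2}\theta\,z\,b\,dz\right], \tag{48b}$$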


where $A = bh_e$ and $A_p = bh_p$ are the cross section areas of the elastic steel layer and the piezoelectric layers, respectively; $Y$ and $Y_p$ are Young’s moduli of the steel and the piezoelectric material; $I = bh_e^3/12$ and $I_p = bh_p^3/12 + bh_p(h_e+h_p)^2/4$ are the area moments of the steel layer and the piezoelectric layer, respectively; $\phi_3^3$ and $\phi_3^1$ are the control voltages applied to the top and the bottom piezoelectric layers; $\theta$ is the temperature variation; $e_{31}$, $\epsilon_{33}$, $p_3$, and $\lambda_p$ are the piezoelectric stress coefficient, the dielectric coefficient, the pyroelectric coefficient, and the stress-temperature coefficient for the piezoelectric material, respectively; $\lambda$ is the steel stress-temperature coefficient; $s^\circ_{xx}$ and $\kappa_{xx}$ are the membrane strain and bending strain; and $r_a = (h_e+h_p)/2$ is the actuator moment arm. Also, one can define $p_x = bq_x$, $p_z = bq_z$ (mechanical excitations per unit length), and $m = \rho bh = 2\rho_pA_p + \rho_eA$ (mass per unit beam length). Then, Eqs.(46) and (47) can be rewritten as


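$$\frac{\partial N_x}{\partial x} + p_x = m\,\ddot u_x, \tag{49}$$

$$\frac{\partial^2 M_x}{\partial x^2} + \frac{\partial N_x}{\partial x}\frac{\partial u_z}{\partial x} + N_x\frac{\partial^2 u_z}{\partial x^2} + p_z = m\,\ddot u_z. \tag{50}$$

Using Eqs.(48a&b), one can write the axial force and bending moment in a compact form:

$$N_x = K s^\circ_{xx} + N_x^c + N_x^\theta, \tag{51a}$$

$$M_x = D \kappa_{xx} + M_x^c + M_x^\theta, \tag{51b}$$

where K is the membrane stiffness, $K = YA + 2Y_pA_p + 2A_p e_{31}^2/\epsilon_{33}$; D is the bending stiffness, $D = YI + 2Y_pI_p + 2I_p e_{31}^2/\epsilon_{33}$; $s^\circ_{xx}$ is the membrane strain and $\kappa_{xx}$ is the bending strain with the von Karman type nonlinearity:

$$s^\circ_{xx} = \frac{\partial u_x}{\partial x} + \frac{1}{2}\left(\frac{\partial u_z}{\partial x}\right)^2, \qquad \kappa_{xx} = \frac{\partial \beta_x}{\partial x} = -\frac{\partial^2 u_z}{\partial x^2}. \tag{52a&b}$$

$N_x^\theta$ is the axial force induced by the temperature rise; $M_x^\theta$ is the moment due to the temperature rise; $N_x^c$ is the axial force induced by the control potential; and $M_x^c$ is the control moment induced by the control potential.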


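$$N_x^c = e_{31}b\,(\phi_3^3 + \phi_3^1), \tag{53a}$$

$$M_x^c = e_{31}b\,r_a(\phi_3^3 - \phi_3^1), \tag{53b}$$

$$N_x^\theta = \left(\frac{p_3 e_{31}}{\epsilon_{33}} - \lambda_p\right)\left(\int_{-(h_e/2+h_p)}^{-h_e/2}\theta\,b\,dz + \int_{h_e/2}^{h_e/2+h_p}\theta\,b\,dz\right) - \lambda\int_{-h_e/2}^{h_e/2}\theta\,b\,dz, \tag{53c}$$

$$M_x^\theta = \left(\frac{p_3 e_{31}}{\epsilon_{33}} - \lambda_p\right)\left(\int_{-(h_e/2+h_p)}^{-h_e/2}\theta\,z\,b\,dz + \int_{h_e/2}^{h_e/2+h_p}\theta\,z\,b\,dz\right) - \lambda\int_{-h_e/2}^{h_e/2}\theta\,z\,b\,dz. \tag{53d}$$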
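As a numerical illustration of the section quantities and of the control moment, the sketch below evaluates A, A_p, I, I_p, and r_a for the PZT/steel/PZT beam dimensions quoted later in Case 2 (b = 0.0508 m, h_e = 0.00635 m, h_p = 254×10⁻⁶ m). Two points are assumptions of this sketch rather than statements from the text: I_p is computed with the parallel-axis theorem about the laminate mid-plane, and the two layer voltages are taken as equal and opposite with magnitude φ, so that the control moment reduces to 2 e_31 b r_a φ. The constant e_31 = 10.43 C/m² is the PZT value quoted in Case 1. Function and variable names are illustrative only.

```python
def beam_section_properties(b, h_e, h_p):
    """Geometric section properties of a three-layer (piezo/elastic/piezo) beam."""
    r_a = 0.5 * (h_e + h_p)                  # moment arm: mid-plane to piezo-layer centroid
    area_elastic = b * h_e                   # A
    area_piezo = b * h_p                     # A_p (one layer)
    inertia_elastic = b * h_e ** 3 / 12.0    # I
    # Assumption: parallel-axis theorem for one piezo layer about the mid-plane.
    inertia_piezo = b * h_p ** 3 / 12.0 + area_piezo * r_a ** 2   # I_p
    return {
        "A": area_elastic,
        "A_p": area_piezo,
        "I": inertia_elastic,
        "I_p": inertia_piezo,
        "r_a": r_a,
    }


def control_moment_per_volt(e31, b, r_a):
    """Control moment per volt, assuming equal and opposite layer voltages."""
    return 2.0 * e31 * b * r_a


props = beam_section_properties(b=0.0508, h_e=0.00635, h_p=254e-6)
coefficient = control_moment_per_volt(e31=10.43, b=0.0508, r_a=props["r_a"])
print(props)
print(f"control moment per volt = {coefficient:.6f} N*m/V")
```

With these inputs the coefficient evaluates to about 0.0035 N·m per volt, in line with the control moment coefficient reported in the beam case study.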


Boundary conditions at the two ends of the laminated beam, x = 0 and x = L, are

$$N_x = N_x^* \quad\text{or}\quad u_x = u_x^*, \tag{54a}$$

$$M_x = M_x^* \quad\text{or}\quad \beta_x = \beta_x^*, \tag{54b}$$

$$\frac{\partial M_x}{\partial x} + N_x\frac{\partial u_z}{\partial x} = Q_x^* \quad\text{or}\quad u_z = u_z^*, \tag{54c}$$

where the quantities with the asterisk are prescribed values on the boundary. Note that usually either force boundary conditions or displacement boundary conditions are selected for a given physical boundary condition.


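It is assumed that the mechanical excitations in the longitudinal and transverse directions are zero, i.e., $p_z = p_x = 0$. Substituting the axial force and bending moment into Eqs.(49,50), one can write

$$\frac{\partial}{\partial x}\left\{K\left[\frac{\partial u_x}{\partial x} + \frac{1}{2}\left(\frac{\partial u_z}{\partial x}\right)^2\right] + N_x^c + N_x^\theta\right\} = m\,\ddot u_x, \tag{55a}$$

$$\frac{\partial^2}{\partial x^2}\left\{-D\frac{\partial^2 u_z}{\partial x^2} + M_x^c + M_x^\theta\right\} + \frac{\partial}{\partial x}\left\{K\left[\frac{\partial u_x}{\partial x} + \frac{1}{2}\left(\frac{\partial u_z}{\partial x}\right)^2\right] + N_x^c + N_x^\theta\right\}\frac{\partial u_z}{\partial x} + \left\{K\left[\frac{\partial u_x}{\partial x} + \frac{1}{2}\left(\frac{\partial u_z}{\partial x}\right)^2\right] + N_x^c + N_x^\theta\right\}\frac{\partial^2 u_z}{\partial x^2} = m\,\ddot u_z, \tag{55b}$$

where $N_x^c$, $M_x^c$, $N_x^\theta$, and $M_x^\theta$ stand for the voltage and temperature expressions of Eqs.(53a-d).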


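Since the longitudinal inertia is negligible, i.e., $m\ddot u_x \approx 0$, factoring the partial derivatives and regrouping the force and moments gives

$$\frac{\partial N_x}{\partial x} = 0, \tag{56}$$

$$\frac{\partial^2 M_x}{\partial x^2} + N_x\frac{\partial^2 u_z}{\partial x^2} = m\,\ddot u_z. \tag{57}$$

Eq.(56) implies that the axial force $N_x$ is not a function of x, i.e., $N_x$ = constant. Considering individual elastic, control, and temperature effects, one can further write Eq.(57) as

$$-D\frac{\partial^4 u_z}{\partial x^4} + \frac{\partial^2 M_x^c}{\partial x^2} + \frac{\partial^2 M_x^\theta}{\partial x^2} + N_x\frac{\partial^2 u_z}{\partial x^2} = m\,\ddot u_z, \tag{58}$$

where $N_x = K\left[\frac{\partial u_x}{\partial x} + \frac{1}{2}\left(\frac{\partial u_z}{\partial x}\right)^2\right] + N_x^c + N_x^\theta$. The nonlinear piezothermoelastic beam equation can be further simplified when boundary conditions are specified. Of the following two cases, the first is used for comparison with the standard beam equation and the second is used for a detailed parametric study.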

1) Free Expansion/Contraction:

If the longitudinal motion either at x = 0 or at x = L is not constrained (free expansion/contraction), the axial force $N_x$ vanishes when the boundary conditions are imposed. The differential equation then can be simplified to

$$-D\frac{\partial^4 u_z}{\partial x^4} + f(x,t) = m\,\ddot u_z, \tag{59}$$

where $f(x,t) = \partial^2 M_x^c/\partial x^2 + \partial^2 M_x^\theta/\partial x^2$. This is a standard form of the beam transverse vibration equation (Meirovitch, 1975). However, note that the physical meaning is much more complicated than that of the conventional form, due to the coupling of mechanical, electric, and temperature fields in the nonlinear piezothermoelastic laminated beam.

2) Simply Supported with Both Ends Fixed:

Boundary conditions for a simply supported piezothermoelastic laminated beam with both ends fixed are

$$u_z = u_x = 0, \quad\text{and}\quad M_x = 0, \tag{60}$$

at both beam ends: x = 0 and x = L. Furthermore, it is assumed that the voltage $\phi$ and the temperature variation $\theta$ are uniform in the x-direction. This implies that $\phi$ and $\theta$ are not functions of the coordinate x. Then, the transverse equation becomes

$$-D\frac{\partial^4 u_z}{\partial x^4} + N_x\frac{\partial^2 u_z}{\partial x^2} = m\,\ddot u_z, \tag{61}$$

where the axial force $N_x = \frac{K}{2L}\int_0^L \left(\frac{\partial u_z}{\partial x}\right)^2 dx + N_x^c + N_x^\theta$. Solution procedures of the simply supported nonlinear piezothermoelastic beam equation are discussed next. Numerical results and control effectiveness are presented in case studies.

CONTROL  OF  NONLINEAR  DEFORMATION  AND  FREQUENCIES 

As discussed previously, distributed piezoelectric layers laminated (coupled or embedded) on elastic shell continua can be used as distributed sensors and/or actuators (Tzou, 1991; Tzou, Zhong, and Natori, 1993; Tzou, Zhong, and Hollkamp, 1994). Injecting high voltages into the distributed piezoelectric actuators induces two major control actions. One is the in-plane membrane control force(s) and the other is the out-of-plane bending control moment(s) (Tzou, 1991; 1993). In general, the control moments are essential in planar structures, e.g., plates and beams (Tzou and Fu, 1994); the membrane control forces are effective in shells (Tzou, Zhong, and Hollkamp, 1994). In this study, the piezoelectric actuators are used to control the nonlinear large deformation and amplitude-dependent frequencies of flexible beams and plates, and their control effectiveness is evaluated. General solutions of the nonlinear equations can be derived by a number of methods, e.g., the double Fourier series method, the Ritz method, Galerkin’s method, the perturbation method, etc. (Chia, 1980). In order to investigate the coupling among elastic, electric, temperature, and control effects of the piezothermoelastic laminated beam, analytical solutions, including all design and control variables, are derived. The solution procedure is divided into two steps. The first step is to solve for the nonlinear static solutions and the second step is to solve for the dynamic solutions with respect to the nonlinear static equilibrium position. Numerical solutions are derived to evaluate the control effectiveness of nonlinear beams and plates in the case studies presented next.

CASE  STUDIES 

Case-1:

Control of Nonlinear Piezoelectric Laminated Plates

A simply supported nonlinear square plate (dimensions: 2a×2b×h) is considered in the case study. The piezoelectric laminated plate is subjected to both thermal and electric bending loads: $\phi_3^3 = -\phi_3^1 = \phi$ and $\theta(z) = \Delta\theta\,(z/h)$, where $\Delta\theta = \theta_t - \theta_b$, and $\theta_t$ and $\theta_b$ are the temperatures on the top and the bottom of the plate, respectively. Thus, the thermal and the electric bending moments are

$$M^\theta_{xx} = M^\theta_{yy} = \int_{-h/2}^{h/2}\lambda\,\theta\,z\,dz = \frac{\lambda h^2}{12}\Delta\theta, \tag{62}$$

$$M^e_{xx} = M^e_{yy} = 2e_{31}r_a\phi. \tag{63}$$

A nondimensional loading parameter $\delta$ [expression illegible in the source scan] is defined by the piezoelectric constant, the moment arm, Young’s modulus, the thermal stress coefficient, the plate thickness and width, and the temperature and the control voltage. The central deflections due to the nondimensional loading $\delta$, calculated based on the linear and nonlinear theories, are plotted in Figure 4. Convergence of the deflection solutions is sufficiently fast, and the results obtained by a one-term approximation in the double-series expansion are adequate (Sundara et al., 1966).


Fig.4 Central deflections of a simply supported piezoelectric laminated plate.


Center deflections of the simply supported nonlinear composite plate with control voltages and temperature loadings are analyzed and the results are plotted in Figures 5-7. Note that the elastic plate is made of steel and the piezoelectric layers are either PZT or PVDF. The plate dimensions are: steel thickness h = 1.0×10^(-?) m (exponent illegible in the source scan), piezoelectric layer thickness $h_p$ = 6.0×10⁻⁵ m, and plate length/width 2a = 0.5 m.

Fig.5 Central deflections of plate versus voltage applied (PZT). [Axes: dimensionless displacement versus voltage applied, φ (V).]

Fig.6 Central deflections of plate versus voltage applied (PVDF). [Horizontal axis: voltage applied, φ (V).]

Fig.7 Central deflections of plate versus temperature variation. [Horizontal axis: temperature variation, θ (°C).]


Figures 5-6 show the linear and nonlinear relationships between the plate deflection and the control voltage. This voltage induced action can be used to counteract the deflections induced by the mechanical or temperature loadings, e.g., Figure 7. Moreover, the PZT induced control action is superior to the PVDF induced control action. (This can be easily inferred from the inherent piezoelectric constants: $e_{31}$ = 10.43 C/m² for PZT while $e_{31}$ = 9.6×10⁻³ C/m² for PVDF.)


Case-2:

Control of Nonlinear Piezoelectric Laminated Beams

A simply supported three-layer PZT/steel/PZT beam (Figure 8) is used in the case study, with dimensions: width b = 0.0508 m, length L = 1 m, steel thickness $h_e$ = 0.00635 m, and lead zirconate titanate (PZT) thickness $h_p$ = 254×10⁻⁶ m. Detailed material properties are summarized in Appendix: Table 1 and Table 2.

Fig.8 A PZT/steel/PZT laminated beam.


Next, the bending stiffness D and the membrane stiffness K can be respectively calculated as: $D = YI + 2Y_pI_p + 2I_p e_{31}^2/\epsilon_{33}$ = 93.765 (N·m²); $K = YA + 2Y_pA_p + 2A_p e_{31}^2/\epsilon_{33}$ = 23.986×10⁸ (N). Note that the values of $YI$, $2Y_pI_p$, and $2I_p e_{31}^2/\epsilon_{33}$ in the PZT/steel/PZT beam are 80%, 18%, and 2% of the total bending stiffness D, respectively, and the values of $YA$, $2Y_pA_p$, and $2A_p e_{31}^2/\epsilon_{33}$ are 92.8%, 6.5%, and 0.7% of the total membrane stiffness K. It is assumed that the applied control voltages $\phi_3^3$ and $\phi_3^1$ are uniformly distributed with $\phi_3^3 = -\phi_3^1 = \phi$, and that the temperature rise $\theta$ is uniform along the x-axis and varies linearly through the thickness: $\theta(z) = az + c$, where $a = (\theta_t - \theta_b)/(h_e + 2h_p)$, $c = (\theta_t + \theta_b)/2$, $\theta_t$ is the top surface temperature, and $\theta_b$ is the bottom surface temperature of the beam. Note that $\theta_b = -\theta_t = \theta$, which implies that the total temperature difference between the top and bottom surfaces is $2\theta$. Then, the electric control bending moment $M_x^c$ and the temperature induced moment $M_x^\theta$ are

$$M_x^c = 0.003499\,\phi\ \text{(N·m)} \quad\text{and}\quad M_x^\theta = 0.34856\,\theta\ \text{(N·m)}, \tag{64a,b}$$

in which 98% of the temperature induced bending moment is due to the steel and only 2% is due to the PZT. (Recall that ceramics are less sensitive to temperature than steels.) The force and moment relationship can be simplified to

$$\frac{\tanh v}{v} - \frac{1}{\cosh^2 v} = \frac{64\,v^4 D^3}{K\,(M_x^c + M_x^\theta)^2 L^4}. \tag{65}$$

Denoting $y_1 = \tanh v/v - 1/\cosh^2 v$ and $y_2 = 64v^4D^3/[K(M_x^c + M_x^\theta)^2L^4]$, one can plot $y_1(v)$ and $y_2(v)$. Intersections of $y_1(v)$ and $y_2(v)$ give the solutions v of Eq.(65), as shown in Figures 9 to 11. Then, the axial force $N_{x,s}$ and the beam center deflection (at x = L/2) can be calculated and their temperature/control effects studied.


(66) 

(67) 
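The graphical intersection of y1(v) and y2(v) can also be located numerically. The sketch below uses plain bisection on y1(v) − y2(v), with D, K, L, and the moment coefficients taken as printed in the text and with φ = 100 V and θ = 10°C as in the figure captions. The function names are hypothetical. The last step assumes that the dimensionless quantity is defined as v = (L/2)·sqrt(N/D), a definition that is not visible in this section, so the axial force it returns should be read as illustrative.

```python
import math


def solve_v(D, K, M0, L, lo=0.1, hi=50.0, tol=1e-12):
    """Find the positive root of tanh(v)/v - 1/cosh(v)^2 = c*v^4 by bisection,
    where c = 64 D^3 / (K M0^2 L^4) and M0 is the total applied moment."""
    c = 64.0 * D ** 3 / (K * M0 ** 2 * L ** 4)

    def g(v):
        return math.tanh(v) / v - 1.0 / math.cosh(v) ** 2 - c * v ** 4

    if g(lo) <= 0.0 or g(hi) >= 0.0:
        raise ValueError("root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid          # root lies to the right of mid
        else:
            hi = mid          # root lies to the left of mid
    return 0.5 * (lo + hi)


def axial_force_from_v(v, D, L):
    """Assumption: v = (L/2)*sqrt(N/D), hence N = 4*D*v^2/L^2."""
    return 4.0 * D * v ** 2 / L ** 2


D, K, L = 93.765, 23.986e8, 1.0          # values as printed in the text
phi, theta = 100.0, 10.0                 # volts, degrees C
M0 = 0.003499 * phi + 0.34856 * theta    # control moment plus thermal moment
v = solve_v(D, K, M0, L)
print(f"v = {v:.4f}, N = {axial_force_from_v(v, D, L):.1f} N")
```

Bisection is used here only because it needs no derivative and cannot leave the bracket; any one-dimensional root finder would serve. The stiffnesses are passed as arguments so that other values can be tried.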


Fig.9 Solution v for various control voltages (temperature θ = 10°C). [Horizontal axis: dimensionless quantity v.]

Fig.10 Solution v for various temperatures (voltage φ = 100 V). [Horizontal axis: dimensionless quantity v.]

Fig.11 Solution v for various beam lengths (θ = 10°C and φ = 100 V). [Horizontal axis: dimensionless quantity v.]


Static deflections of the beam center (x = L/2) with respect to the applied control voltage (at θ = 10°C), temperature rise (at φ = 100 V), and beam length (with φ = 100 V, θ = 10°C) are plotted in Figures 12-14. Note that the 10°C temperature represents a total of 20°C difference between the top and bottom surfaces. The deflection and voltage relation, Figure 12, gives a general guideline that the control voltage induced displacement can be used to compensate the temperature induced deflection or the nonlinear deflection. The equivalent axial force with respect to the beam center deflection is presented next.

Fig.14 Static deflections for various beam lengths (θ = 10°C and φ = 100 V). [Horizontal axis: length of the beam (m).]

Figure 15 shows the axial force versus the static deflections of the beam center, which reveals that the induced axial force stiffens the beam and consequently the natural frequencies of the beam increase. (Note that this force can also be viewed as an axial control force.) The frequency increase can be expressed by the quantity $\{[1 + N_{x,s}L^2/(i^2\pi^2D)]^{1/2} - 1\}\times 100$ percent, where i is the mode number, and the results are shown in Figures 16-18. The percentage of variation for the first mode is higher than those of the higher modes. The numerical results suggest that both the static deflection and the dynamic behavior of the simply supported nonlinear PZT/steel/PZT laminated beam are influenced by the temperature and that they can also be controlled by the control voltages applied to the piezoelectric actuators.

Fig.15 Axial forces vs. beam deflections. [Horizontal axis: transverse displacement of beam middle point (mm).]

Fig.16 Frequency variations vs. control voltages (θ = 10°C). [Horizontal axis: voltage applied (V).]

Fig.17 Frequency variations vs. temperatures (φ = 100 V).

Fig.18 Frequency variations vs. beam lengths (θ = 10°C, φ = 100 V). [Horizontal axis: length of the beam (m).]
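The frequency-increase percentage discussed above can be evaluated directly. The sketch below assumes the standard result for a simply supported beam under a constant axial tension N, whose i-th natural frequency is raised by the factor [1 + N L²/(i²π²D)]^(1/2) relative to the unloaded beam. D = 93.765 N·m² and L = 1 m are the values printed for the PZT/steel/PZT beam; the axial force of 1000 N is an arbitrary illustrative number, not a result from the report, and the function name is hypothetical.

```python
import math


def frequency_increase_percent(axial_force, length, bending_stiffness, mode):
    """Percentage rise of the mode-th natural frequency of a simply supported
    beam caused by a constant axial tension (standard beam-column result)."""
    ratio = axial_force * length ** 2 / (mode ** 2 * math.pi ** 2 * bending_stiffness)
    return (math.sqrt(1.0 + ratio) - 1.0) * 100.0


# Assumed axial force; D and L as printed for the case-study beam.
for mode in (1, 2, 3):
    rise = frequency_increase_percent(1000.0, 1.0, 93.765, mode)
    print(f"mode {mode}: +{rise:.2f} %")
```

Because the axial-force term is divided by the square of the mode number, the percentage drops quickly with mode number, which is consistent with the observation that the first mode shows the largest variation.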

SUMMARY AND CONCLUSIONS

Beam- and plate-like structures and components are widely used in aerospace
structures. Imposed shape changes and surface control of flexible beams and plates could
offer aerodynamic advantages in flight maneuverability and precision control.
Deformed shapes and surfaces often involve nonlinear deformations. Studies of the control of
nonlinear behavior related to deformed surfaces and shape changes would provide
detailed information for future controlled-surface design and implementation. This research
is concerned with the control effectiveness of nonlinearly deformed beams and plates based
on smart structures technology.

In the recent development of smart structures and structronic systems, piezoelectric
materials are widely used as sensors and actuators in sensing, actuation, and control
applications. This research investigates the control effectiveness of piezoelectric
laminated nonlinear flexible beams and plates (rectangular and square plates) subjected to
mechanical and temperature excitations. It is assumed that the flexible beams and plates
exhibit the von Karman type geometrical nonlinearity. Thermoelectromechanical
equations and boundary conditions including elastic, temperature, and piezoelectric
couplings were formulated first, and analytical solutions were derived. The reduced
nonlinear static equations are identical to the classic nonlinear plate and beam equations.
Dynamics, electromechanical couplings, and control of nonlinear piezoelectric laminated
beams and plates with large deformations were investigated. Active control of nonlinear
flexible deflections, thermal deformations, and natural frequencies using distributed
piezoelectric actuators was studied, and the nonlinear effects were evaluated.

Nonlinear static deflections under the influence of temperature and control voltage
were studied. Voltage-temperature and displacement relations for piezoelectric laminated
nonlinear plates and beams were investigated. The voltage-imposed actuations can be used
to compensate for the nonlinear deformation and the temperature-induced deformation.
Small-amplitude beam oscillations with respect to the nonlinearly deformed static
equilibrium position were investigated. It was observed that the total bending stiffness of
the PZT/steel/PZT laminated beam is contributed by the steel (80%) and the PZT (elasticity:
18% and piezoelectricity: 2%), and the total membrane stiffness is contributed by the steel
(98%) and the PZT (elasticity: 6.5% and piezoelectricity: 0.7%). The
piezoelectricity-contributed stiffness is relatively insignificant. Simulation results also
suggested that the voltage-induced control displacement/force can be used to compensate for
the nonlinear static deflection and temperature effects and to adjust the natural frequencies of the
piezoelectric laminated beam.

APPENDIX:  MATERIAL  PROPERTIES


Table 1  PZT material properties.

  Property                         Value
  Young's modulus                  Yx = Yy = Yz = 61 GPa
  Shear modulus                    Gxy = Gxz = Gyz = 23.64 GPa
  Poisson's ratio                  μ = 0.29
  Density                          ρ = 7.7×10^3 kg/m^3
  Thermal expansion coefficient    α = 1.2×10^-6 m/m/°C
  Thermal stress coefficient       λp = 1.03×10^5 N/m^2/°C
  Electric permittivity            ε33 = 1.65×10^-8 F/m
  Piezoelectric constant           d31 = 171×10^-12 C/N (m/V)
                                   e31 = 10.43 C/m^2
  Pyroelectric constant            p3 = 0.25×10^-4 C/m^2/°C


Table 2  Steel material properties.

  Property                         Value
  Young's modulus                  Yx = Yy = Yz = 68.95 GPa
  Shear modulus                    Gxy = Gxz = Gyz = 26.52 GPa
  Poisson's ratio                  μ = 0.30
  Density                          ρ = 7.75×10^3 kg/m^3
  Thermal expansion coefficient    α = 1.1×10^-5 m/m/°C
  Thermal stress coefficient       λ = 1.08×10^6 N/m^2/°C
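
As a quick cross-check of Tables 1 and 2, the tabulated constants can be compared against three textbook relations: G = Y/(2(1 + μ)) for an isotropic material, λ = Yα/(1 − μ) for the thermal stress coefficient, and e31 = d31·Y for the piezoelectric stress constant. This section of the report does not state these relations, so they are assumptions of the sketch below, and the function and dictionary names are illustrative only.

```python
# Cross-check of the tabulated material constants.
# The three relations used here are ASSUMPTIONS (standard textbook forms),
# not formulas quoted from the report.

pzt = {"Y": 61e9, "G": 23.64e9, "mu": 0.29, "alpha": 1.2e-6,
       "lam": 1.03e5, "d31": 171e-12, "e31": 10.43}
steel = {"Y": 68.95e9, "G": 26.52e9, "mu": 0.30, "alpha": 1.1e-5,
         "lam": 1.08e6}


def shear_modulus(Y, mu):
    """Isotropic relation G = Y / (2 (1 + mu))."""
    return Y / (2.0 * (1.0 + mu))


def thermal_stress_coefficient(Y, alpha, mu):
    """Assumed relation lambda = Y * alpha / (1 - mu)."""
    return Y * alpha / (1.0 - mu)


def rel_err(computed, tabulated):
    """Relative difference between a computed and a tabulated value."""
    return abs(computed - tabulated) / abs(tabulated)


for name, m in (("PZT", pzt), ("steel", steel)):
    g = shear_modulus(m["Y"], m["mu"])
    lam = thermal_stress_coefficient(m["Y"], m["alpha"], m["mu"])
    print(name, "G rel. error:", rel_err(g, m["G"]),
          "lambda rel. error:", rel_err(lam, m["lam"]))

# Assumed one-dimensional relation e31 = d31 * Y for the PZT layer.
e31 = pzt["d31"] * pzt["Y"]
print("e31 rel. error:", rel_err(e31, pzt["e31"]))
```

Under these assumed relations the tabulated values agree to within rounding, which supports reading the garbled symbols in the tables as μ, λ and e31.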


A Progressive Refinement Approach to Planning and Scheduling


William  J.  Wolfe 

Associate  Professor  of  Computer  Science 
Department  of  Computer  Science  and  Engineering 
University  of  Colorado  at  Denver 

Abstract 


A  summary  of  the  algorithmic  results  presented  in  this  report:  1.  We  introduce  a  "novel"  scheduling 
problem  called  the  window-constrained  packing  problem  (WCP),  and  we  provide  extensive  simulation 
results; 2. We describe the "0-filter" approach, a creative way to deal with multiple sorting features; 3.
The  "priority  dispatcher"  that  we  invented  uses  three  phases:  i.  selection/sort;  ii.  allocation;  iii. 
optimization.  This  simple  modular  design  can  be  modified  in  a  variety  of  ways,  making  it  easy  to  adapt  to 
specific  applications;  4.  The  "look-ahead"  algorithm  that  we  invented  uses  the  dispatcher  to  look  ahead  in 
the  list  of  jobs  to  determine  a  good  choice  of  allocation  options;  5.  The  "augmented  genetic"  algorithm 
that  we  invented  has  the  potential  to  produce  near-optimal  results,  when  time  permits. 

There  are  so  many  applications  that  require  advanced  planning  and  scheduling  techniques  that  we  can  only 
provide  a  partial  listing,  such  as:  Transportation  Systems  (Airlines,  Trucking,  Logistics,  Buses,  Highways, 
etc.);  Facilities  Management  (Hospitals,  Courts,  Schools,  etc.);  Communications  Systems  (Telephones, 
Cable,  etc.);  Factories;  Spacecraft  (Earth  Observing,  Deep  Space,  Communications,  etc.).  We  identified 
three factors for evaluating the many algorithmic approaches: speed, simplicity, and accuracy. “Speed”
refers to the time it takes an algorithm to process and return its results. “Simplicity” refers to the ease with
which users are able to understand the algorithm’s logic. “Accuracy” refers to the quality of the algorithm’s
results as measured by some criterion (i.e.: an objective function). The complexity of these environments
pushed  us  toward  fast  and  simple  algorithms,  such  as  dispatch  methods.  For  increased  accuracy  we 
developed  a  look  ahead  method  (relatively  fast,  relatively  simple,  and  very  accurate).  If  the  accuracy 
requirements  are  severe  (i.e.:  need  near-optimal  solutions)  then  simplicity  and  speed  can  be  sacrificed  in 
favor  of  more  complex  and  time  consuming  methods  (e.g.:  simulated  annealing,  genetic,  tabu  search,  etc.). 
We  have  concluded  that  look-ahead  approaches,  and  variations  on  that  theme,  are  very  often  the  best 
compromise,  usually  providing  high  quality  results  within  reasonable  run  time  limits,  while  also  being 
relatively  easy  to  understand.  It  is  very  difficult,  and  in  many  cases  impossible,  to  define  a  reliable 
objective function. Therefore in most cases there is no easy way of comparing algorithmic results other
than to say that one performed better with respect to a particular criterion. This observation pointed us toward fast
and simple methods, since more complex methods are, more or less, using a lot of processing time to
fine-tune the results according to a presumed objective function. Thus, there are at least three reasons why the
pursuit of accuracy is questionable: 1. accuracy is often defined by a subjective weighting of multiple,
disparate criteria (weighing apples and oranges); 2. the time between planning and implementation is
often  long  enough  for  many  of  the  assumptions  the  plan  is  based  on  to  become  invalid;  and  3.  computing 
accurate  solutions  often  takes  a  prohibitive  amount  of  processing  time.  There  is  little  point  in  pursuing 
accuracy  at  the  expense  of  speed  and  simplicity  in  a  rapidly  changing  environment.  Highly  accurate 
algorithmic  results  can  be  made  obsolete  by  small  changes  in  the  constraints.  The  constraints  themselves 
are often vague or imprecise. Additionally, we do not want a high speed scheduler that reacts to every little
change, constantly moving jobs around. The scheduling process could become "unstable", with rapidly changing
commitments that confuse customers and operators. These considerations led us to incorporate a "fuzzy"
approach.  In  fact,  we  have  tentatively  concluded  that  highly  complex,  dynamic,  domains  with  multiple, 
possibly  vague,  criteria  are  prime  application  areas  for  the  fuzzy  logic  approach.  To  add  further  context  to 
these  algorithmic  results  we  also  explored  a  hierarchical  approach  called  "progressive  refinement".  It  takes 
into  account  a  rolling  horizon  of  out-day  schedules  that  are  continuously  adapted  until  operational  time. 
This  approach  helps  to  smooth  over  many  of  the  de-stabilizing  inputs  (e.g.:  machine  failure,  canceled 
orders, new orders, price changes, etc.).

1.  Introduction


The development of automated scheduling systems continues to be an area of great interest in industry and
academia.  It  involves  the  computer  modeling  of  jobs,  resources,  and  constraints,  as  well  as  the  design  of 
algorithms  and  other  systems  that  would  help  human  schedulers  do  their  jobs.  There  are  so  many 
applications  that  require  advanced  planning  and  scheduling  techniques  that  we  can  only  provide  a  partial 
listing,  such  as: 

Transportation  Systems  (Airlines,  Trucking,  Logistics,  Buses,  Highways,  etc.); 

Facilities  Management  (Hospitals,  Courts,  Schools,  etc.); 

Communications  Systems  (Telephones,  Cable,  etc.); 

Factories; 

Spacecraft  (Earth  Observing,  Hubble  Space  Telescope,  Deep  Space,  Communications,  etc.). 

We identified three factors for evaluating the many algorithmic approaches: speed, simplicity, and accuracy.
“Speed” refers to the time it takes an algorithm to process and return its results. “Simplicity” refers to the
ease with which users are able to understand the algorithm’s logic. “Accuracy” refers to the quality of the
algorithm’s results as measured by some criterion (i.e.: an objective function).

The  application  areas  mentioned  above  are  very  complex  and  highly  dynamic.  The  complexity  of  these 
environments  pushed  us  toward  fast  and  simple  algorithms,  such  as  dispatch  methods.  For  increased 
accuracy we developed a particular look ahead method (relatively fast, relatively simple, and very accurate).
If  the  accuracy  requirements  are  severe  (i.e.:  need  near-optimal  solutions)  then  simplicity  and  speed  can  be 
sacrificed  in  favor  of  more  complex  and  time  consuming  methods  (e.g.:  simulated  annealing,  genetic,  tabu 
search,  etc.).  We  have  concluded  that  look-ahead  approaches,  and  variations  on  that  theme,  are  very  often 
the  best  compromise,  usually  providing  high  quality  results  within  reasonable  run  time  limits,  while  also 
being  relatively  easy  to  understand. 

Because  of  the  complexity  of  these  domains  it  is  very  difficult,  and  in  many  cases  impossible,  to  define  a 
reliable  objective  function.  Therefore  in  most  cases  there  is  no  easy  way  of  comparing  algorithmic  results 
other than to say that one performed better with respect to a particular criterion. This observation also points us
toward  fast  and  simple  methods,  since  more  complex  methods  are,  more  or  less,  using  a  lot  of  processing 
time  to  fine-tune  the  results  according  to  a  presumed  objective  function. 

Thus, there are at least three reasons why the pursuit of "accuracy" is questionable: 1. accuracy is often
defined by a subjective weighting of multiple, disparate criteria (weighing apples and oranges); 2. the
time between planning and implementation is often long enough for many of the assumptions the plan is
based on to become invalid; and 3. computing accurate solutions often takes a prohibitive amount of
processing time. There is little point in pursuing accuracy at the expense of speed and simplicity in a
rapidly changing environment (e.g.: rapidly changing goals, priorities, cost factors, machine availability,
etc.).  Highly  accurate  algorithmic  results  can  be  made  obsolete  by  small  changes  in  the  constraints.  The 
constraints  themselves  are  often  vague  or  imprecise,  further  discouraging  the  fine-tuning  of  algorithmic 
results. 

Additionally, we do not want a high speed scheduler that reacts to every little change, constantly moving
jobs around. The scheduling process could become "unstable", with rapidly changing commitments that confuse
customers, suppliers, and workers. These considerations led us to incorporate a "fuzzy" approach (section
6). In fact, we have tentatively concluded that highly complex, dynamic domains with multiple, possibly
vague, criteria are prime application areas for the fuzzy logic approach.

To  add  further  context  to  these  algorithmic  results  we  also  explored  a  hierarchical  approach  called 
"progressive  refinement".  It  takes  into  account  a  rolling  horizon  of  out-day  schedules  that  are  continuously 
adapted until operational time. This approach helps to smooth over many of the de-stabilizing inputs (e.g.:
machine  failure,  canceled  orders,  new  orders,  price  changes,  etc.).  Although  many  of  the  simulations 
presented  in  this  report  are  based  on  a  "batch"  approach  (i.e.:  dispatch  algorithms)  it  is  clear  that  such 
methods  must  be  integrated  into  progressive  refinement,  or  incremental,  methods  in  most  applications. 

The  algorithms  that  we  discuss  in  this  report  can  be  classified  as  follows. 


Algorithm Classifications

Optimal:
    Depth First (exhaustive search)
    Branch and Bound
    Linear Programming

Heuristic:
    Constructive Heuristics:
        Priority Dispatcher
        Look Ahead
        Fuzzy Logic
    Improvement Heuristics:
        Hill Climbing
        Simulated Annealing
        Tabu Search
    Repair Heuristics:
        Iterative Refinement
    Genetic:
        Indirect Representation
        Augmented Genetic

Optimal  algorithms  are  guaranteed  to  find  optimal  solutions,  but  they  usually  take  a  prohibitive  amount  of 
time.  The  constructive  heuristics  begin  with  a  batch  of  jobs  and  a  clean  slate,  and  schedule  the  jobs  one  by 
one  (with  limited  backtracking)  until  a  complete  schedule  is  achieved.  The  improvement  heuristics  begin 
with  a  schedule  and  find  ways  to  improve  the  quality.  A  constructive  heuristic  is  often  used  to  create  a 
schedule  that  is  then  used  as  the  starting  point  for  an  improvement  heuristic. 
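
The hand-off from a constructive heuristic to an improvement heuristic can be sketched as follows. This is a minimal illustration under stated assumptions, not the dispatcher developed in this report: the single-machine job model, the earliest-due-date ordering, the total-tardiness measure and the adjacent-swap hill climbing are all choices made here for brevity.

```python
from typing import List, Tuple

# Hypothetical job model: (name, duration, due date) on a single machine.
Job = Tuple[str, int, int]


def total_tardiness(order: List[Job]) -> int:
    """Sum of lateness over all jobs when run back to back in this order."""
    clock = 0
    late = 0
    for _, duration, due in order:
        clock += duration
        late += max(0, clock - due)
    return late


def construct(jobs: List[Job]) -> List[Job]:
    """Constructive step: start from a clean slate and add the jobs one by
    one, here in earliest-due-date order (an assumed dispatch rule)."""
    schedule: List[Job] = []
    for job in sorted(jobs, key=lambda j: j[2]):
        schedule.append(job)
    return schedule


def improve(schedule: List[Job]) -> List[Job]:
    """Improvement step: hill climbing over swaps of adjacent jobs.
    A swap is kept only if it strictly reduces total tardiness."""
    best = list(schedule)
    improved = True
    while improved:
        improved = False
        for i in range(len(best) - 1):
            candidate = best[:]
            candidate[i], candidate[i + 1] = candidate[i + 1], candidate[i]
            if total_tardiness(candidate) < total_tardiness(best):
                best = candidate
                improved = True
    return best


jobs = [("a", 10, 5), ("b", 1, 6), ("c", 1, 7)]
start = construct(jobs)
final = improve(start)
print(total_tardiness(start), total_tardiness(final))
```

Because each accepted swap strictly lowers a non-negative integer, the loop always terminates, but like any hill climber it can stop at a local optimum.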

The  repair  heuristics  apply  when  a  sudden  "disturbance"  (e.g.:  machine  failure,  canceled  order,  etc.)  has  been 
introduced  and  a  quick  fix  is  needed.  For  example,  the  schedule  might  be  amended  in  such  a  way  as  to  cause 
the  least  amount  of  change  to  existing  commitments.  There  are  many  variations  on  this  theme.  For 
example,  a  "schedule"  might  have  many  constraint  violations  (e.g.:  overbooking)  while  it  is  going  through 
a  process  of  incremental  improvement.  Some  authors  call  this  "iterative  refinement"  [3].  One  distinction 
that could be made is whether or not the momentary "schedules" are feasible¹ or can contain constraint
violations.  Until  we  get  to  the  section  on  progressive  refinement  we  assume  that  a  "schedule"  has  no 
constraint  violations. 

The genetic approach is in a class by itself. It maintains a population of schedules and breeds new
schedules from "pieces" of the best schedules in the current population.

1  "Feasible" means that the schedule can be implemented (but it may be very inefficient).

We have found that we can begin with a particular heuristic approach and then create a variety of hybrids
that use parts and pieces from any or all of the above methods. Thus, it is sometimes pointless to compare
the individual methods since most applications will require a unique hybrid approach. We worked at
understanding each method well enough to be able to quickly evaluate the cost-benefit trade-off of including
some part of the method in a hybrid approach to a real problem.

There  are  several  issues  that  come  up  in  one  form  or  another  throughout  this  report:  algorithm  processing 
time;  algorithm  rationale  (i.e.:  simplicity);  multiple,  conflicting,  criteria;  constraint  propagation  and 
relaxation;  bottlenecks  and  other  features  of  the  job  conflicts;  general  rules  of  thumb;  uncertainty  vs. 
complexity;  dynamic  environments;  scheduling  horizons;  stability;  progressive  refinement. 


Overview  of  Planning  and  Scheduling  concepts:  From  allocating  machines  to  jobs  in  a  factory, 
to  making  lane  changes  on  a  highway,  a  wide  variety  of  engineering  problems  involve  some  form  of 
planning  or  scheduling:  the  allocation  of  resources  to  jobs  over  time.  The  term  "planning"  usually  relates 
to  higher  level  issues,  such  as  the  overall  purpose,  goals  and  strategies  of  an  endeavor,  while  the  term 
"scheduling"  usually  relates  to  lower  level  issues,  such  as  specific  equipment  and  operation  start/stop  times. 
In  the  simplest  possible  terms,  planning  deals  with  "what"  and  scheduling  with  "when".  This  distinction  is 
usually  minor  and  most  authors,  like  us,  use  the  terms  interchangeably,  but  scheduling  strongly  connotes 
the  specific  assignment  of  activities  to  time  lines,  which  is  the  main  topic  of  this  report. 


There  are  several  issues  that  make  automated  scheduling  difficult,  the  primary  ones  being: 

•  Ambiguity:  vague  or  poorly  defined  terms,  criteria,  constraints  or  specifications. 

•  Modeling:  it  is  difficult  to  build  efficient,  high  fidelity,  models  of  the  domain. 

•  Uncertainty:  unexpected  events  and  inherently  random  processes. 

•  Combinatorial  Explosion:  a  huge  number  of  alternatives  and  options. 

•  Dynamics:  rapidly  changing  goals,  resources,  costs,  requirements,  etc. 

Ambiguous criteria make it impossible to objectively tell when one schedule is better than another. For
example, criteria such as "customer satisfaction" or "high quality product" are commonly expressed but are
also difficult to quantify. They summarize a complex synthesis of several factors (possibly including such
amorphous factors as the customer's perceptions of quality, etc.). The criteria can sometimes be clarified by
identifying the contributing components, such as inventory cost, number of late jobs, machine preferences,
average job flow time, etc. But individual components can conflict; for example, low cost and on time
usually conflict with high quality². It may also be very difficult to agree on weighting factors that might
quantify the relative merits of trading off one criterion in favor of another (e.g.: is a little lateness worth a
small increase in quality?). The result of a successful analysis of the relevant criteria is a clearly specified
objective function. The function could then be used to reliably distinguish not only that one schedule is
better than another but also by how much it is better. Unfortunately, more often than not such an objective
function is either a gross oversimplification of the problem or based on several subjective parameters.
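
To make the role of subjective weights concrete, the sketch below scores two hypothetical schedules with a weighted sum of three of the components named above. The weighted-sum form, the component values and the weights are all assumptions chosen for illustration; the point is only that a modest change in one weight reverses which schedule looks better.

```python
from dataclasses import dataclass


@dataclass
class ScheduleStats:
    """Hypothetical summary statistics of a finished schedule."""
    inventory_cost: float
    late_jobs: int
    avg_flow_time: float


def objective(stats: ScheduleStats, weights: dict) -> float:
    """Weighted-sum objective (lower is better). The weights are
    subjective parameters, which is exactly the difficulty discussed."""
    return (weights["inventory_cost"] * stats.inventory_cost
            + weights["late_jobs"] * stats.late_jobs
            + weights["avg_flow_time"] * stats.avg_flow_time)


schedule_a = ScheduleStats(inventory_cost=100.0, late_jobs=2, avg_flow_time=10.0)
schedule_b = ScheduleStats(inventory_cost=150.0, late_jobs=0, avg_flow_time=12.0)

# Two equally defensible weightings of lateness.
strict_on_lateness = {"inventory_cost": 1.0, "late_jobs": 50.0, "avg_flow_time": 2.0}
lenient_on_lateness = {"inventory_cost": 1.0, "late_jobs": 10.0, "avg_flow_time": 2.0}

for label, w in (("strict", strict_on_lateness), ("lenient", lenient_on_lateness)):
    print(label, objective(schedule_a, w), objective(schedule_b, w))
```

With the strict weighting schedule B scores lower (better); with the lenient weighting schedule A does, even though neither schedule has changed.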

Criteria specifications can be related to at least these three broad categories: 1. Cost and profit factors; 2.
Managerial policy decisions; 3. Customer preferences. Policy decisions and customer preferences are
related to profit, but in a long term sense. Clearly, one major difficulty in quantifying the criteria is that they
may change suddenly (market fluctuations, manager decisions, customer preferences, etc.).


2  This reminds me of the sign that a student said was posted on a production manager's desk. It listed the
following three items: "1. ON TIME; 2. HIGH QUALITY; 3. CHEAP"; under which was the advice: "PICK ANY
TWO".

Constraints must also be clearly specified. There are at least three types of constraints: 1. Criteria-related;
2. Physical; 3. Both. Criteria specifications are sometimes interpreted as constraints. Take for example:
"get  all  the  jobs  done  on  time".  This  is  not  a  physical  constraint,  since  it  can  be  violated,  but  at  a  cost. 
Physical  constraints  are  usually  the  kind  that  cannot  be  violated,  such  as  size,  shape,  weight,  or  rate 
limitations  related  to  the  use  of  a  particular  piece  of  equipment.  Capacity  limits,  exclusive  use,  precedence, 
and  maximum  number  of  shared  users  are  also  examples  of  physical  constraints. 

Some  constraints  are  just  preferences  related  to  physical  attributes  of  a  product,  such  as  color  or  size. 
Another  way  of  classifying  constraints  is  to  distinguish  hard  from  soft  constraints.  A  hard  constraint 
cannot  be  violated  but  a  soft  constraint  has  degrees  of  acceptability.  For  example,  a  soft  constraint  might 
be  expressed  as:  the  customer  wants  green,  but  blue  is  acceptable,  and  other  colors  are  less  acceptable.  But 
the  fact  that  the  product  must  be  painted  might  be  a  hard  constraint.  Now,  it  is  also  clear  that  when  the 
boss  says  that  there  are  to  be  no  late  jobs,  then  for  all  intents  and  purposes  this  is  a  hard  constraint.  The 
important  distinction,  however,  is  the  difference  between  relaxable  and  non-relaxable  constraints.  For 
example,  when  the  boss  concludes  that  it  is  not  possible  for  all  the  jobs  to  be  on  time,  the  due  date 
constraint  might  be  relaxed  to  allow  jobs  to  be  up  to  1  day  late.  But  the  boss  cannot  dictate  the  violation 
of  the  laws  of  physics  (e.g.:  equipment  operational  limits).  The  variations  along  these  lines  are  almost 
endless. 
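
The paint example can be written down directly. In the sketch below the hard constraint is a yes/no test and the soft constraint is a degree of acceptability between 0 and 1; the numeric scores and the dictionary representation of a product are assumptions made for illustration.

```python
from typing import Optional

# Hypothetical acceptability scores for the soft color constraint:
# green is what the customer wants, blue is acceptable, anything else less so.
COLOR_ACCEPTABILITY = {"green": 1.0, "blue": 0.7}
OTHER_COLOR_ACCEPTABILITY = 0.2


def hard_ok(product: dict) -> bool:
    """Hard constraint: the product must be painted."""
    return bool(product.get("painted", False))


def soft_score(product: dict) -> float:
    """Soft constraint: degree of acceptability of the chosen color."""
    return COLOR_ACCEPTABILITY.get(product.get("color"), OTHER_COLOR_ACCEPTABILITY)


def evaluate(product: dict) -> Optional[float]:
    """Return None if a hard constraint is violated, else the soft score."""
    if not hard_ok(product):
        return None
    return soft_score(product)


print(evaluate({"painted": True, "color": "blue"}))
print(evaluate({"painted": False, "color": "green"}))
```

The hard constraint decides whether an option is admissible at all, while the soft constraint only ranks the admissible options.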


Figure  A:  There  are  several  ways  to  classify  constraints. 

A last example: if there are only 5 machines in a shop then a valid schedule would use 5 or fewer machines;
but that does not prevent the scheduler from creating a hypothetical schedule that uses 6 machines; that is,
the hard constraint is relaxed to do a "what if" analysis that evaluates the benefit of adding a machine to the
shop.

The distinction between hard and soft constraints, and the means by which constraints are relaxed, plays a
big role in most scheduling processes. For example, constraints might have relaxation levels: when it is
discovered that a feasible schedule cannot be found, all constraints of level 1 are relaxed (made less strict)
and the search for a feasible schedule is resumed; if that fails, all the constraints of level 2 are relaxed, etc.
Finally, we note that there are other ways to classify constraints³, such as discrete vs. continuous, and
deterministic vs. probabilistic.
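
The relaxation-level idea can be sketched as a loop that retries the search with progressively more constraints loosened. The `Constraint` record, the dictionary summary of a candidate schedule, and the two example constraints (a due date that may slip by one day, and a five-machine limit that is never relaxed) are hypothetical; only the level-by-level control flow follows the description above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Schedule = Dict[str, int]  # hypothetical summary of a candidate schedule


@dataclass
class Constraint:
    name: str
    level: Optional[int]  # relaxation level; None means non-relaxable
    strict: Callable[[Schedule], bool]
    relaxed: Optional[Callable[[Schedule], bool]] = None


def satisfied(s: Schedule, constraints: List[Constraint], relaxed_up_to: int) -> bool:
    """Check s, using the relaxed form of every relaxable constraint whose
    level is at or below relaxed_up_to and the strict form of the rest."""
    for c in constraints:
        use_relaxed = (c.level is not None and c.relaxed is not None
                       and c.level <= relaxed_up_to)
        check = c.relaxed if use_relaxed else c.strict
        if not check(s):
            return False
    return True


def search(candidates: List[Schedule], constraints: List[Constraint],
           max_level: int) -> Tuple[Optional[Schedule], Optional[int]]:
    """Try with nothing relaxed (level 0), then relax level 1, level 2, ..."""
    for relaxed_up_to in range(max_level + 1):
        for s in candidates:
            if satisfied(s, constraints, relaxed_up_to):
                return s, relaxed_up_to
    return None, None


constraints = [
    Constraint("due date", 1,
               strict=lambda s: s["late_days"] == 0,
               relaxed=lambda s: s["late_days"] <= 1),
    Constraint("machines in shop", None,
               strict=lambda s: s["machines"] <= 5),
]
candidates = [{"late_days": 1, "machines": 5}, {"late_days": 0, "machines": 6}]
found, level = search(candidates, constraints, max_level=2)
print(found, level)
```

Here no candidate passes the strict check, so the due-date constraint is relaxed at level 1 and the first candidate is accepted; the six-machine candidate is never accepted because the machine limit is not relaxable in this sketch.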


3  An example just came up: an author called all the system constraints the implicit constraints and all the
customer preferences the explicit constraints.

Figure B: The criteria and constraints specify the quality of the scheduling options.

Computer  algorithms  work  with  representations  of  the  domain,  or  in  other  words,  with  computer  models. 
These  models  attempt  to  capture  the  critical  features  of  the  criteria,  resources,  and  constraints,  so  that  the 
algorithms  can  efficiently  evaluate  the  scheduling  options.  Intuitively,  the  hard  constraints  define  the 
boundaries  of  the  search  space  while  the  soft  constraints  define  the  shape  of  a  surface  that  stretches  between 
the  boundaries  (i.e.:  the  hills  and  valleys  of  the  objective  function).  The  sketch  in  the  figure  is  a 
simplification  in  several  ways:  1.  dimensionality  of  a  realistic  scheduling  problem  is  much  higher;  2.  there 
are  many  non-linear  interactions  that  can  occur  between  suitability  functions;  and  3.  the  sketch  overlooks 
the  fact  that  suitability  functions  might  relate  to  complex  interactions  among  several  constraints  (e.g.:  set 
up  time,  integrated  maintenance  schedule,  etc.).  But  the  sketch  captures  the  basic  idea:  a  complicated 
surface that will not be easy to navigate.

Figure C: Many dynamic factors come into play in a realistic scheduling problem.


Computer  models  are  almost  always  incomplete  because  of  domain  complexity  and  limits  to  what  can  be 
effectively  represented  in  data  structures.  A  major  contributor  to  complexity  is  uncertainty.  An  unexpected 
event  can  render  a  well  thought  out  schedule  useless,  and  at  the  extreme,  if  there  are  lots  of  unexpected 
events  then  scheduling  becomes  a  futile  activity.  For  example,  in  a  factory  where  machines  are  constantly 
breaking  down  there  is  really  no  point  to  spending  vast  amounts  of  time  refining  the  schedule.  The 
assignment  of  resources  would  probably  be  based  on  a  simple  rule,  such  as:  put  the  most  important  job  on 
the first available machine (until it breaks down). Equipment failures are just one category of unexpected
events; others include changes in organizational policy, changes in customer priorities, and changes in due
dates.  Schedules  made  according  to  different  policies  might  vary  greatly,  and  if  policy  changes  are  frequent 
and  far  reaching,  again,  the  effectiveness  of  detailed  scheduling  is  significantly  reduced. 
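
The simple rule quoted above (most important job on the first available machine) can be sketched in a few lines. The job tuples and the use of a heap to track when each machine becomes free are assumptions of this illustration, and machine breakdowns are deliberately not modeled.

```python
import heapq
from typing import List, Tuple

# Hypothetical job model: (name, priority, duration); larger priority
# means more important.
Job = Tuple[str, int, int]


def dispatch(jobs: List[Job], n_machines: int) -> List[Tuple[str, int, int, int]]:
    """Assign jobs in decreasing priority to whichever machine frees up first.
    Returns (job name, machine id, start time, end time) tuples."""
    # Heap of (time at which the machine becomes free, machine id).
    machines = [(0, m) for m in range(n_machines)]
    heapq.heapify(machines)
    assignments = []
    for name, _priority, duration in sorted(jobs, key=lambda j: -j[1]):
        free_at, machine = heapq.heappop(machines)
        assignments.append((name, machine, free_at, free_at + duration))
        heapq.heappush(machines, (free_at + duration, machine))
    return assignments


plan = dispatch([("a", 1, 3), ("b", 5, 2), ("c", 3, 4)], n_machines=2)
for row in plan:
    print(row)
```

A rule of this kind needs no objective function and can simply be rerun whenever a machine fails, which is why it suits highly uncertain environments.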

There are domains where the degree of uncertainty is very small, but even then there is still the issue of
modeling a resource accurately, which might require large data structures or a large number of complex
differential equations. The fidelity of the models may never reach the level of accuracy that is required, or
the computational efficiency may be poor.


Figure  D:  Progressive  refinement  adds  the  details  as  the  day  of  operation  approaches. 

To get a handle on the problems associated with changing criteria, modeling uncertainty, and the huge
number of scheduling options, certain scheduling strategies are generally recommended, especially
progressive refinement. Progressive refinement begins with a rough out-day schedule and progressively
refines it until it is a valid schedule at day 0 (i.e.: operations). This opens up a variety of hierarchical
possibilities. The further out the day, the cruder the "schedule" can be; that is, not all the hard constraints
have to be satisfied. For example, there may be some overbooking, an unspecified communication
link, etc. As the near day approaches, details can be added and adjustments made, while simultaneously
accounting for unexpected events (newly arriving high priority orders, equipment failure, etc.). The
schedule may be modified right up to the time of operation. This is also typical of how business trips and
vacations are planned. Three principles related to progressive refinement emerge:

•  Priority  Aging:  in  a  distributed  environment,  when  a  task  is  scheduled  a  commitment  is  made  that 
can  spawn  many  contingent  plans,  possibly  reaching  out  to  other  organizations  beyond  the  domain 
of  the  current  schedule.  If  the  task  is  suddenly  bumped,  or  moved,  the  effect  can  propagate  in 
unexpected  and  undesirable  ways.  Intuitively,  the  task  grows  roots  as  it  sits  on  the  schedule 
waiting  for  execution.  It  is  therefore  wise  to  include  a  factor  that  raises  the  effective  priority  of  an 
incumbent  task  the  longer  it  sits  on  the  schedule,  making  it  more  and  more  difficult  to  move  or 
bump. 

•  Stability:  this  concept  is  related  to  priority  aging,  but  refers  to  the  degree  of  task  movement  we  are 
willing  to  tolerate  to  incorporate  a  new  task,  or  to  account  for  a  machine  failure,  etc.  It  may  be 
possible to get a new task on the schedule without bumping (i.e.: off the schedule) any task, but
several  tasks  may  have  to  be  modified.  It  is  not  wise  to  be  continually  shuffling  several  tasks 
around  (i.e.:  avoid  nervous  scheduling)  since,  for  example,  a  subtle  system  safety  factor  might  be 
overlooked,  and  the  humans  who  interact  with  the  system  might  become  thoroughly  confused. 
Therefore  a  limit  should  be  put  on  how  many  tasks  will  be  moved  at  any  one  time. 

•  Opportunistic  Planning:  unexpected  changes  usually  have  an  adverse  effect  on  the  quality  of  a 
schedule,  but  in  some  situations  a  change  creates  previously  unconsidered  opportunities.  Be  on  the 
look  out  for  such  free  opportunities,  and  capitalize  on  them. 
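
The first two principles lend themselves to a small sketch. The linear aging formula, the aging rate and the move limit below are hypothetical choices, since the text only requires that effective priority grow with time on the schedule and that the number of moved tasks be capped.

```python
def effective_priority(base_priority: float, days_on_schedule: int,
                       aging_rate: float = 0.1) -> float:
    """Priority aging: an incumbent task 'grows roots' the longer it sits
    on the schedule. Linear growth is an assumption of this sketch."""
    return base_priority * (1.0 + aging_rate * days_on_schedule)


def can_bump(new_task_priority: float, incumbent_priority: float,
             incumbent_days_on_schedule: int, aging_rate: float = 0.1) -> bool:
    """A new task may bump an incumbent only if it beats the incumbent's
    aged (effective) priority."""
    aged = effective_priority(incumbent_priority, incumbent_days_on_schedule,
                              aging_rate)
    return new_task_priority > aged


def accept_revision(tasks_moved: int, max_moves: int = 3) -> bool:
    """Stability: reject a revision that moves too many tasks at once."""
    return tasks_moved <= max_moves


# A priority-6 request can displace a freshly scheduled priority-5 task,
# but not the same task after it has sat on the schedule for four days.
print(can_bump(6, 5, 0), can_bump(6, 5, 4))
print(accept_revision(2), accept_revision(5))
```

Both knobs trade schedule quality for predictability: a higher aging rate or a lower move limit makes the schedule calmer but less responsive to new high priority work.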

The result of a successful modeling phase is a recognition of the degree of unavoidable uncertainty, and an
identification of the critical parameters that can be used to represent the resources efficiently. If the
objective function were well defined, and the models were accurate⁴, the algorithms would be able to explore
a well defined search space of scheduling options. The classical literature of Operations Research (OR)
thrives on such cases, but the number of constraints and options can be so huge that the number of
variables exceeds practical limits.


4 


That  is,  if  a  miracle  occurs. 
An algorithm has to be quite smart to avoid exploring many bad options. Effective rules or heuristics
usually incorporate specific details of the particular domain, and consequently such rules do not apply to
other domains. This is not a simple matter, and it goes to the heart of all heuristic search strategies. We
will discuss such heuristics in the following sections.
2. Window-Constrained Packing Problem


After investigating several factory scheduling problems we identified a particular packing problem that
typifies  the  difficulties  that  arise.  We  call  it  the  "window-constrained  packing  problem"  (WCP).  This 
problem  captures  the  "essence"  of  many  scheduling  trade-offs  without  unnecessary  complexity.  Many  of  the 
results  described  in  this  report  were  first  confirmed  on  this  problem  and  then  extrapolations  were  made  to 
draw  conclusions  concerning  more  complex  problems. 


Many scheduling problems boil down to placing, or fitting, activities onto time lines^5. For simplicity
assume  that  there  is  one  resource  and  that  the  time  span  involved  is  the  interval  [0,  T].  In  this  ideal  model 
the  time  line  is  divided  into  discrete  steps  and  an  activity  can  stop  at  the  same  time  as  the  start  of  another 
activity. 


[Figure 1 sketch: a time line from 0 to T divided into discrete steps, with jobs A through E placed along it.]

Figure 1: Discrete time steps; a job can start exactly where another job stops.


There are $n$ jobs $\{\mathrm{job}_i \mid i = 1, \ldots, n\}$ and a common constraint is that $\mathrm{job}_i$ must be done within its window
of opportunity: $[w_{0i}, w_{fi}]$. A job is scheduled by assigning it a continuous^6 duration ($d_i$) that satisfies:
$d_{\min,i} \le d_i \le d_{\max,i}$, where $d_{\max,i}$ and $d_{\min,i}$ are given for each job. That is, each job has a minimum and
maximum duration. The scheduled time is referred to as an activity.


Figure 2: Each job has a priority and must be scheduled (i.e.: become an activity) within its window and
duration constraints.

A typical complication is that there can be preferences for placements within the window of opportunity. A
typical assumption is that the middle of the window is preferred over the edges, so schedules that place jobs
in the middle of their windows would be given higher scores. This preference might be specified by a
suitability function^7.

^5 Classical Knapsack Packing and Job Sequencing problems fall into this category [14].
^6 This is known as the "nonpreemptive" assumption.


Figure  3:  Several  jobs  compete  for  time  on  the  resource  within  their  windows  of  opportunity. 


Figure 4: A job is defined by a minimum ($d_{\min}$) and a maximum ($d_{\max}$) duration that must be scheduled
within a window of opportunity. The score profile expresses the preference for placement within the
window (for the bell curve in this sketch the preference favors the center of the window).

Each job also has a priority assigned to it: $1 \le p_i \le 10$. The priority measures the relative worth of the job,
and might be measured in profit-dollars or other measures of value^8.

When  there  is  competition  for  the  resource  the  scheduler  must  determine  which  jobs  get  on  the  schedule  and 
specify  the  start  and  stop  times  that  satisfy  the  window  limits  and  duration  constraints. 

^7 Sometimes called a score profile, or utility function.

^8 An example that jumps to mind is the "scientific worth" of a satellite observation activity.
Putting this all together we can formally define the window-constrained packing problem (see Figure 5).

Window-Constrained Packing Problem (WCP): Given $n$ objects (requests), each having:

a window of opportunity $[w_0, w_f]$;
a suitability function with domain $[w_0, w_f]$: $\{s(t) \mid t \in [w_0, w_f]\}$;
a minimum length ($d_{\min}$) and a maximum length ($d_{\max}$); and
a priority $p$.

Maximize:

$$Q = \sum_{i=1}^{n} x_i \cdot p_i \cdot \mathrm{area}_i$$

Where:

$$\mathrm{area}_i = \sum_{t=\mathrm{start}_i}^{\mathrm{stop}_i} s_i(t)$$

Subject to the constraints:

$$d_{\min,i} \le d_i \le d_{\max,i} \le w_i = w_{fi} - w_{0i}$$

$$w_{0i} \le \mathrm{start}_i \le w_{fi} - d_i$$

$$x_i \in \{0, 1\}$$ (note: $x_i = 1$ means that $\mathrm{job}_i$ is on the schedule)

$$[\mathrm{start}_i, \mathrm{stop}_i) \cap [\mathrm{start}_j, \mathrm{stop}_j) = \emptyset, \quad i \ne j$$ (consistency)

Figure 5: The details of the Window-Constrained Packing Problem.

The WCP arises as a sub-problem in many scheduling problems (e.g.: job shop, satellites, deliveries, etc.).
In the simplest case $s(t) = 1$ for all the job windows, and $\mathrm{area}_i = \mathrm{stop}_i - \mathrm{start}_i = d_i$ for all scheduled jobs.
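As an illustration only (this is not code from the study), the WCP definition can be written down as a small data model. The names `Job`, `area`, `is_feasible` and `total_score` are hypothetical. The sketch assumes integer time steps and sums the suitability over the unit steps of the half-open interval [start, stop), so that a flat profile gives area = stop - start = d, as in the simplest case described above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Job:
    w0: int      # start of the window of opportunity
    wf: int      # end of the window of opportunity
    dmin: int    # minimum duration
    dmax: int    # maximum duration
    p: int       # priority
    s: Callable[[int], float] = lambda t: 1.0  # suitability; flat by default


def area(job: Job, start: int, stop: int) -> float:
    """Sum the suitability over the unit steps covered by [start, stop)."""
    return sum(job.s(t) for t in range(start, stop))


def is_feasible(jobs: List[Job], schedule: Dict[int, Tuple[int, int]]) -> bool:
    """schedule maps a job index to (start, stop); absent jobs have x_i = 0."""
    for i, (start, stop) in schedule.items():
        job = jobs[i]
        if not job.dmin <= stop - start <= job.dmax:
            return False  # duration constraint violated
        if not (job.w0 <= start and stop <= job.wf):
            return False  # activity leaves its window
    spans = sorted(schedule.values())
    # After sorting by start time, only neighbours can overlap.
    return all(a_stop <= b_start
               for (_, a_stop), (b_start, _) in zip(spans, spans[1:]))


def total_score(jobs: List[Job], schedule: Dict[int, Tuple[int, int]]) -> float:
    """Q = sum of p_i * area_i over the scheduled jobs."""
    return sum(jobs[i].p * area(jobs[i], start, stop)
               for i, (start, stop) in schedule.items())
```

For example, with `jobs = [Job(0, 4, 2, 4, 3), Job(2, 6, 2, 3, 5)]`, the schedule `{0: (0, 3), 1: (3, 6)}` is feasible and scores 3·3 + 5·3 = 24.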
[Figure 7 sketch: a job's window from w0 to wf, with its scheduling options ranging from d = dmin to d = dmax.]

Figure 7: There are usually many scheduling options for each job.


The  WCP  has  the  constraints:  job  duration,  window  placement,  window  limits,  exclusive  use  of  the 
resource  (i.e.:  capacity  =  1),  non-preemptive  activities.  A  more  complicated  model  would  have  multiple 
resources,  multiple  windows  of  opportunity,  multiple  operations  for  a  given  job,  precedence  constraints 
among  the  jobs/operations,  etc.  In  what  follows  we  stick  with  the  simple  WCP,  but  we  have  simulated 
scenarios  with  the  other  constraints  as  well  to  verify  that  our  "extrapolations"  to  more  complex  problems 
are  reasonable.  It  is  safe  to  say,  however,  that  if  we  are  running  into  run-time  limits  on  the  simple 
problem  (WCP)  then  we  will  certainly  have  worse  run-time  problems  on  the  more  complex  problem. 

Notice that we are, for the moment, assuming that the only criterion is Q, which is a function of the
individual priorities and associated areas. That is, there is no score related to the other possible criteria,
such as:


$$\%\text{Jobs} = \frac{\#\ \text{jobs scheduled}}{\#\ \text{jobs competing}} \cdot 100$$

$$\%\text{Utilization} = \frac{\text{time allocated}}{\text{total time}} \cdot 100$$


There  are  several  other  criteria  that  might  play  a  role  in  evaluating  a  schedule  (e.g.:  job  spacing),  but  for 
now  these  are  good  examples  of  alternatives  (or  additions)  to  the  Q  defined  above.  This  will  become  more 
important  later,  when  we  bring  up  the  issue  of  multiple  criteria. 
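A minimal sketch of the two alternative criteria, assuming a schedule is stored as a mapping from job name to a (start, stop) pair; the function names are hypothetical and not taken from the study.

```python
def percent_jobs(n_scheduled: int, n_competing: int) -> float:
    """Share of the competing jobs that made the schedule, in percent."""
    return 100.0 * n_scheduled / n_competing


def percent_utilization(schedule: dict, total_time: int) -> float:
    """Share of the time line [0, T] that is allocated, in percent."""
    allocated = sum(stop - start for start, stop in schedule.values())
    return 100.0 * allocated / total_time
```

With two of three competing jobs scheduled as `{"A": (0, 3), "B": (3, 6)}` on a time line of length 8, the job percentage is about 66.7 and the utilization is 75.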

In  the  next  sections  we  introduce  specific  scheduling  algorithms  and  then  evaluate  their  performance  on  the 
WCP.  We  will  randomly  generate  test  problems,  apply  the  algorithms  and  compare  the  resulting  scores. 

For very small problem sizes we can run an exhaustive search to determine the optimal solution and use
that for comparisons as well. But first we will analyze the number of scheduling options that we might be
dealing with (i.e.: the complexity of the problem).

Complexity:  One  way  to  find  an  optimal  solution  is  to  exhaustively  search  the  possibilities  and 
compute  the  scores  of  all  possible  schedules.  But  this  can  be  very  time  consuming  since  the  average  job 
can  have  many  scheduling  options: 

Number of scheduling options for a job with window width $w$ and duration limits $d_{\min}$, $d_{\max}$:

$$\#\ \text{options} = 1 + \sum_{q=d_{\min}}^{d_{\max}} [w - q + 1] = 1 + (d_{\max} - d_{\min} + 1) \cdot \left(w + 1 - \frac{d_{\min} + d_{\max}}{2}\right)$$

The  "1"  occurring  after  the  equal  sign  accounts  for  the  option  of  not  scheduling  the  job.  In  fact,  one  of  the 
most critical aspects of this problem is that we do not know ahead of time which jobs should "make" the
schedule  and  which  ones  to  leave  off  the  schedule. 

The largest number of options occurs when the window is the largest, the minimum duration is the smallest,
and the maximum duration is the largest. Let $w = w_{\max}$, and suppose $d_{\min} = 1$ and $d_{\max} = w$:

$$w \cdot \left(w + 1 - \frac{w + 1}{2}\right) + 1 = \frac{w(w + 1)}{2} + 1 = \frac{w^2}{2} + \frac{w}{2} + 1 = O(w^2).$$

So, in the worst case, since there are $n$ jobs, there can be as many as $w^{2n}$ total scheduling options.
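The option count can be checked by direct enumeration. The sketch below assumes integer time steps, so that a duration q has w - q + 1 possible start positions inside a window of width w. Both function names are hypothetical.

```python
def count_options(w: int, dmin: int, dmax: int) -> int:
    """Enumerate every (duration, start) placement, plus the option to skip."""
    count = 1  # the option of not scheduling the job
    for q in range(dmin, dmax + 1):
        for _start in range(0, w - q + 1):
            count += 1
    return count


def closed_form(w: int, dmin: int, dmax: int) -> float:
    """Closed form of the same count: 1 + (dmax - dmin + 1)(w + 1 - (dmin + dmax)/2)."""
    return 1 + (dmax - dmin + 1) * (w + 1 - (dmin + dmax) / 2)
```

For w = 25, dmin = 5, dmax = 10 both give 112 options. In the worst case dmin = 1, dmax = w = 5 both give 16, which equals w²/2 + w/2 + 1.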

The simplest situation is when $d_{\min} = d_{\max} = w$ for each job: 1 option per job. But even this case can get
sticky since the jobs may be inconsistent (i.e.: conflict with each other): if the jobs are all consistent there
is only 1 total scheduling option; if two jobs overlap then we must choose 1 out of 2, and so on if several
overlap.

So, roughly speaking, each job has about $w$ scheduling options, and there are about $w^n$ possible schedules.
The number of scheduling options grows exponentially as the number of jobs increases (it also grows as $w$
increases, but at a slower rate). Some of these options will be eliminated because they create inconsistent
(non-feasible) schedules. Another extreme is when a job window does not overlap any other job window:
the only option that should then be considered for that job is when the duration ($d_{\max,i}$) is placed to give the
maximum area; no other options need be considered since they are guaranteed to produce a worse overall
schedule.

Although there are several "special cases" it is clear that the general problem has an exponentially growing
number of scheduling options to consider. Finally, notice that the classical Knapsack Problem^9 is the
special case:

windows are all the length of the full time interval (i.e.: the "knapsack" = [0, T]);
each job has one duration: $d_{\max,i} = d_{\min,i}$;
suitability functions are flat (i.e.: no preferences within a window).

Thus, the WCP is also NP-complete. It is believed (no one has been able to provide a proof) that NP-complete
problems do not have polynomial time algorithms that always produce optimal solutions. This
forces us to explore necessarily sub-optimal heuristic algorithms. Thus, there is an inherent speed-accuracy
trade-off.


^9 Known to be NP-complete [16].
3. Algorithms Overview


Here  we  provide  an  overview  of  the  many  algorithmic  approaches  that  we  considered  (more  details  are 
provided later). The organization of the algorithms into the following categories is not a strict
categorization; it just provides a convenient way to refer to them.

I.  Optimal  Algorithms: 

A.  Depth  First  (DF):  Exhaustive  search  of  the  options;  guaranteed  to  find  the  optimal  solution; 
usually  takes  an  impractical  amount  of  time. 

B.  Branch  and  Bound  (BB):  Almost  an  exhaustive  search,  but  branches  that  cannot  improve  on 
the  best  result  found  so  far  are  pruned;  usually  takes  an  impractical  amount  of  time. 

II.  Constructive  Heuristics: 

A.  Priority  Dispatch  (PD):  Rank  the  jobs,  and  schedule  them  one  by  one  until  a  complete 
schedule  is  constructed;  no  backtracking  or  bumping  of  scheduled  jobs;  a  very  fast  but  sub- 
optimal  algorithm.  Despite  the  name,  we  do  not  have  to  rank  the  jobs  according  to  "priority",  we 
can  use  any  features  we  want  (simple  to  calculate)  to  rank  the  jobs. 

B.  Look  Ahead  (LA):  Proceed  as  in  the  Priority  Dispatch  but  when  placing  a  job  on  the  schedule 
score  each  of  its  options  by  (hypothetically)  dispatching  the  rest  of  queue  (i.e.:  apply  the  PD)  and 
getting  a  predicted  schedule  score;  it  is  not  optimal,  but  it  does  significantly  better  than  the 
Priority  Dispatch;  the  run  time  is  noticeably  longer  than  the  Priority  Dispatch  but  still  reasonably 
fast  (PD  =  O(nlogn);  LA  =  O(n^) ). 

C.  Fuzzy  Logic:  We  consider  fuzzy  logic  as  a  means  of  extending  the  rule-based  aspects  of  the 
PD  algorithms.  That  is,  we  "fuzzify"  the  rules  so  that  they  read: 

"when  the  PRIORITY  is  high  and  the  LAXITY  is  low  then  the  RANK  is  high". 

Priority,  Laxity  and  Rank  are  referred  to  as  "linguistic  variables",  and  they  take  on  "terms"  such  as 
low,  medium  and  high  [13].  The  fuzzy  approach  adds  a  layer  of  abstraction  to  the  rules  that  is 
helpful  when  discussing  the  nature  of  the  algorithm  in  a  non-technical  way.  Computations  are 
easily  integrated  into  the  approach  through  specific  membership  functions  and 
fuzzification/defuzzification  steps.  Although  we  have  only  just  begun  to  incorporate  fuzzy  logic 
into  the  PD,  we  anticipate  that  we  will  gain  robustness  and  flexibility,  with  little  loss  of 
simplicity  and  speed. 

III.  Improvement  Algorithms:  Start  with  a  schedule  (usually  the  output  of  one  of  the  constructive 
approaches)  and  try  to  improve  it. 

A. Hill Climbing (HC): Begin with a schedule and consider ways to change the parameters (move
a job, swap two jobs, increase/decrease the duration of a job, etc.). After considering several possible
"moves" pick the one that improves the schedule the most, then iterate. Halt when no move can
improve the schedule.
B. Simulated Annealing (SA): The same as Hill Climbing but we do not always take the best
move; in fact we sometimes (by a probabilistic decision) take a move that decreases the quality of the
schedule. As the iterations progress we bias the decisions in favor of Hill Climbing and
ultimately converge to pure Hill Climbing.

C.  Tabu  Search  (TS):  As  we  consider  the  moves,  as  in  the  Hill  Climber,  we  keep  a  list  of  moves 
that  are  not  allowed  to  be  repeated  within  a  certain  number  of  iterations.  Like  Simulated 
Annealing  we  will  not  halt  at  local  optima,  but  unlike  Simulated  Annealing  we  maintain  a 
deterministic  search  strategy  (as  opposed  to  the  randomness  of  SA). 

D.  Repair:  These  algorithms  are  variations  on  the  previous  heuristics  but  the  assumption  is  that 
we  are  dealing  with  a  schedule  that  has  been  suddenly  rendered  incorrect  by  some  event  (e.g.: 
machine  failure,  a  canceled  order,  etc.).  This  may  boil  down  to  making  a  few  simple  adjustments 
to  the  schedule  (e.g.:  slide  all  the  jobs  on  the  failed  machine  forward  to  the  next  available 
machine),  or  it  might  involve  the  re-scheduling  of  most  of  the  scheduled  jobs.  In  many  cases  the 
"stability"  of  the  schedule  comes  into  play.  The  "stability"  measures  the  number  of  changes  in  the 
schedule  over  a  short  period  of  time.  It  is  usually  undesirable  to  be  rapidly  changing  the  schedule 
(even  if  the  "score"  of  the  new  schedule  is  better  in  some  other  sense).  A  typical  strategy  would  be 
to only reschedule a few selected jobs, or only the lowest priority jobs.

There  are  a  couple  of  other  terms  that  come  up  in  the  literature  that  are  something  like  "repair": 
"incremental"  and  "reactive".  The  incremental  approach  is  similar  to  the  repair  approach,  except  that  it 
implies  the  day-to-day  scheduling  process:  adjusting  jobs  to  meet  the  constraints,  fixing  an  oversubscription 
problem  here,  a  missing  resource  allocation  there,  etc.  We  bring  up  such  issues  under  the  title  "progressive 
refinement"  later  in  this  report.  Reactive  scheduling  implies  the  immediate  response  to  an  "emergency" 
situation,  like  a  "reflex  action",  and  is  thus  almost  identical  to  what  we  call  repair-based,  only  differing  in 
the  amount  of  time  and  thought  required  to  make  the  repair. 

IV.  Genetic  Algorithms:  A  "population"  of  schedules  is  allowed  to  evolve  by  breeding  new  schedules  from 
the  better  schedules  seen  so  far.  We  tested  two  particular  GA's: 

A.  Indirect  GA  (GA):  Each  member  of  the  population  is  a  permutation  of  the  list  of  jobs.  The 
members  are  evaluated  by  feeding  the  permutation  into  either  the  PD  or  LA  algorithms  (or  any 
other  constructive  method).  Feeding  the  permutations  into  the  PD  will  run  through  the 
populations  faster  than  with  the  LA,  but  the  LA  will  tend  to  give  better  results  in  fewer 
generations.  We  explored  a  few  "crossover"  operators  and  the  results  presented  below  are  with  the 
PMX  operator  [17].  In  this  representation  a  member  of  the  population  is  a  permutation,  and  the 
permutation  is  used  to  construct  a  schedule  (as  opposed  to  using  the  schedules  themselves  as 
members  of  the  population)  so  it  is  referred  to  as  an  "indirect"  representation.  More  details 
(mutations,  etc.)  are  provided  later  in  this  report. 

B. Augmented GA: We added more information to the Indirect GA. Each permutation is
augmented with 0's and 1's to indicate which jobs are allowed to be scheduled. That is, a 1
indicates that the dispatcher should consider scheduling the job, and a 0 that it should not. We
augment the crossover to pass these 0's and 1's along with the generations. This modification
improves the GA results significantly.
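For readers unfamiliar with the PMX (partially mapped crossover) operator mentioned under the Indirect GA, the following is a sketch of the textbook form of the operator applied to two job permutations. It is not the implementation used in this study: the function name and the explicit cut points are assumptions, and the handling of the 0/1 flags of the Augmented GA is not shown because it is not specified here.

```python
from typing import List, Optional


def pmx(parent_a: List[int], parent_b: List[int], cut1: int, cut2: int) -> List[int]:
    """Partially mapped crossover; the child keeps parent_a[cut1:cut2] in place."""
    size = len(parent_a)
    child: List[Optional[int]] = [None] * size
    child[cut1:cut2] = parent_a[cut1:cut2]
    copied = set(parent_a[cut1:cut2])

    # Place the genes of parent_b's segment that were displaced by the copy.
    for i in range(cut1, cut2):
        gene = parent_b[i]
        if gene in copied:
            continue
        pos = i
        # Follow the mapping until we land outside the copied segment.
        while cut1 <= pos < cut2:
            pos = parent_b.index(parent_a[pos])
        child[pos] = gene

    # Every remaining position is inherited directly from parent_b.
    for i in range(size):
        if child[i] is None:
            child[i] = parent_b[i]
    return child
```

The child is always a valid permutation, so it can be fed to a constructive method (PD or LA) exactly like its parents. For example, `pmx([1, 2, 3, 4, 5, 6, 7, 8, 9], [9, 3, 7, 8, 2, 6, 5, 1, 4], 3, 7)` keeps the segment 4, 5, 6, 7 from the first parent and fills the rest from the second.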

After running many versions of these algorithms on the WCP it became clear that the WCP has three
critical  pieces: 

•  Which  jobs  make  the  schedule? 

•  What  order  should  they  be  in? 

•  What  is  the  exact  start  and  stop  time  for  each  scheduled  job? 
These questions are not independent, but can be interpreted in a hierarchical way. First of all, if we knew
ahead of time which of the contending jobs were going to make the schedule (a subset of the total set of
competing jobs) then finding the optimal solution to the WCP problem would be significantly easier.
Given the jobs that are to make the schedule, then if we knew the ordering of those jobs on the resource the
problem would be even easier (the only thing then left to do is to slip/slide/grow/shrink the jobs until we get
the highest score we can get^10).

It  is  interesting  to  see  how  each  of  our  algorithmic  methods  deals  with  these  questions.  For  example,  the 
PD  answers  the  first  question,  implicitly,  by  ranking  the  jobs,  then  answers  the  second  question  with  its 
allocation  rules,  and  finally  answers  the  third  question  by  "growing"  the  jobs  that  made  the  schedule.  It  is 
more  difficult  to  see  these  separate  issues  in  some  of  the  other  algorithms  but  we  have  found  that,  in 
general,  these  three  questions  help  simplify  the  analysis. 

Even  though  we  have  this  modular  hierarchy  of  concerns,  we  must  keep  in  mind  that  a  given  set  of  jobs  can 
compete  for  a  resource  in  a  possibly  large  number  of  ways.  Thus,  unless  we  check  all  interactions  between 
jobs  and  evaluate  all  potential  conflicts  we  cannot  be  certain  that  we  have  the  optimal  solution.  Such  is 
the  nature  of  NP-Complete  problems.  Another  way  of  saying  this  is  to  point  out  that  for  any  proposed 
heuristic  a  set  of  jobs  can  be  created  that  will  force  the  heuristic  to  miss  the  optimal  solution  (possibly  by  a 
huge  amount).  Therefore,  the  effectiveness  of  a  heuristic  depends  on  the  nature  of  the  particular  set  of 
competing  jobs. 


^10 We are assuming that the suitability functions are simple (e.g.: unimodal) and that a simple Hill Climbing
approach would be very effective; if the functions are complex then this step is much more difficult.
4. Optimal Algorithms

Depth First
Branch and Bound

The "optimal" algorithms are guaranteed to produce optimal solutions (according to a well defined criterion),
although they may take a long time to do so. Most optimal algorithms are impractical in real applications
but they often serve as theoretical tools for grasping the complexity of the problem. We will discuss two
such algorithms^11.

Depth  First  (DF):  A  Depth-First  search  (see  figure  8)  would  be  one  way  to  exhaustively  search  all 
possible  schedules  and  find  the  highest  scoring  one  (the  optimal  schedule).  But  such  a  search  would  be 
unnecessarily  time  consuming  in  most  cases.  The  Branch  and  Bound  algorithm  described  below  uses 
knowledge  of  the  best  schedule  so  far  to  prune  away  fruitless  branches  of  the  search. 

Branch  and  Bound  (BB):  One  way  that  we  might  eliminate  branches  is  by  somehow  knowing  that  a 
branch  cannot  possibly  give  us  a  result  better  than  one  we  already  have  (see  figures  12a  and  12b). 

Suppose we knew that there is at least one schedule with a score of Q = x. If we are constructing a
schedule, job by job, we can use x to detect bad branches: when considering an option, compute the total
score that would be obtained if all the remaining requests were scheduled at their optimal placement
(maximum duration, window middle, without checking for conflicts with other unscheduled jobs). If this
value is less than x we can safely prune that branch off the tree of choices.

To be more specific: given that the jobs are indexed as usual ($\{\mathrm{job}_i \mid i = 1, \ldots, n\}$) and we are sequencing in a
depth first sense through the job options in that order, let:

$$Q_{k-1} = \sum_{i=1}^{k-1} x_i \cdot \mathrm{score}_i$$

where $\mathrm{score}_i = p_i \cdot \mathrm{area}_i$

and $x_i = 1$ if $\mathrm{job}_i$ is on the schedule, $x_i = 0$ if $\mathrm{job}_i$ is not on the schedule.

When we consider $\mathrm{job}_k$ we can score its possible placements by computing:

$$\text{placement score} = q_k = Q_{k-1} + \mathrm{score}_k + Q_{\max,k+1}$$

$$Q_{\max,k+1} = \sum_{j=k+1}^{n} p_j \cdot \mathrm{maxarea}_j$$

$\mathrm{maxarea}_j$ = the largest possible area achievable by $\mathrm{job}_j$
(usually the maximum duration placed in the window middle).

$Q_{k-1}$ is the score of the partial schedule up till the $k$th job is considered, $\mathrm{score}_k$ is the score of a particular
placement of $\mathrm{job}_k$, and $Q_{\max,k+1}$ is the maximum possible score of all the jobs later in the queue: $\mathrm{job}_j$, $j > k$
(without considering conflicts). Notice that as we score the options $Q_{k-1}$ and $Q_{\max,k+1}$ do not change. We
only consider options that do not conflict with the previous scheduling decisions. If the score of an option
($q_k$) is less than the score of the best schedule so far then we prune that option and its branch.

^11 These algorithms might be called "backtracking" algorithms.
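The pruning rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation evaluated in this report: time steps are integers, the suitability functions are flat (so area equals duration and maxarea equals the maximum duration), and the names `Job`, `options` and `branch_and_bound` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Tuple


@dataclass(frozen=True)
class Job:
    w0: int    # window start
    wf: int    # window end
    dmin: int  # minimum duration
    dmax: int  # maximum duration
    p: int     # priority


def options(job: Job) -> Iterator[Tuple[int, int]]:
    """Every (start, stop) placement of the job inside its window."""
    for d in range(job.dmin, job.dmax + 1):
        for start in range(job.w0, job.wf - d + 1):
            yield start, start + d


def branch_and_bound(jobs: List[Job]) -> Tuple[int, Dict[int, Tuple[int, int]]]:
    n = len(jobs)
    # q_max[k] bounds the score of jobs k..n-1 when conflicts are ignored.
    q_max = [0] * (n + 1)
    for k in range(n - 1, -1, -1):
        q_max[k] = q_max[k + 1] + jobs[k].p * jobs[k].dmax
    best_score, best_schedule = -1, {}

    def search(k: int, q_prev: int, placed: Dict[int, Tuple[int, int]]) -> None:
        nonlocal best_score, best_schedule
        if k == n:  # leaf: a complete schedule
            if q_prev > best_score:
                best_score, best_schedule = q_prev, dict(placed)
            return
        # None stands for the option of leaving job k off the schedule.
        for choice in list(options(jobs[k])) + [None]:
            score_k = 0
            if choice is not None:
                start, stop = choice
                if any(start < b and a < stop for a, b in placed.values()):
                    continue  # conflicts with an earlier decision
                score_k = jobs[k].p * (stop - start)
            if q_prev + score_k + q_max[k + 1] < best_score:
                continue  # q_k cannot beat the best schedule so far: prune
            nxt = placed if choice is None else {**placed, k: choice}
            search(k + 1, q_prev + score_k, nxt)

    search(0, 0, {})
    return best_score, best_schedule
```

Because the bound never underestimates what the remaining jobs can contribute, pruning does not remove any branch that could improve on the best schedule, so the result is still optimal. For `[Job(0, 4, 2, 4, 3), Job(2, 6, 2, 3, 5)]` the search returns a score of 24 with the jobs at (0, 3) and (3, 6).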


[Figure 8 sketch: a search tree; at each leaf node, test whether total worth > Best.]

Figure 8: Depth First Search. Sort the requests. Consider all scheduling options by taking the top
of the queue, finding a scheduling option, then going on to the next request. When a leaf node is reached
evaluate the schedule (score) and see if it is the "best so far". Then backtrack to other possibilities^12.

^12 Remember to consider the option of not scheduling the job.
Figure 9: Backtracking in a depth first search begins when we hit the last job. Then we try all the
options  for  the  last  job,  and  when  they  are  depleted  the  search  backtracks  to  the  previous  job  and  considers 
its  next  option,  and  so  on. 


Figure  10:  Constraints  propagate  as  scheduling  decisions  are  made.  Some  jobs  are  not  affected,  and  some 
have  no  options  left. 
[Figure 11 sketch, "Constraint Relaxation": we want to schedule dmax on the resource, but it won't fit, so the duration constraint is relaxed from the maximum to something that fits the schedule.]

Figure 11: A simple example of "constraint relaxation". In this case job k wanted to be scheduled for the
maximum duration, but the previous scheduling commitments would conflict, so the duration is relaxed to
a length that fits the available time. In general, constraints are relaxed from higher levels of satisfaction to
lesser ones as the need arises.
[Figure 12a sketch: a branch of the search tree labeled with the maximum possible score that this branch can produce.]

Figure 12a: A way to prune some branches in the BB algorithm. At each scheduling option assume that
all the remaining requests are scheduled at maximum duration and at their highest scoring positions within
their windows of opportunity without regard for conflicts. If the resulting score is less than the highest
scoring schedule saved so far then there is no point in pursuing this branch.


Figure  12b:  Branch  and  Bound:  score  each  scheduling  option  by  adding  the  score  of  the  partial  schedule 
and  the  score  of  the  option  and  the  maximum  possible  score  of  the  remaining  jobs.  If  the  score  of  an 
option  is  less  than  the  best  schedule  so  far  prune  that  branch  of  the  search  tree. 


Branch and Bound might still have to search all the possibilities; it depends on the particular jobs.
However, pruning away "bad" options (branches of the search) can save time, without compromising the
guarantee of optimality. The BB algorithm will take a relatively long time to run and will be restricted to
problems of size n < 10 (feel free to push back the wall of exponential growth). The requests can be sorted
at the start to make it easier to prune paths. Estimated order: $O(w^n)$.




Figure 13a: The case considered here is $w_{\max} = 25$, $w_{\min} = 5$, $T = 200$. The top curve is the worst case
number  of  scheduling  options  for  each  problem  size.  The  middle  curve  is  the  actual  number  of  scheduling 
options  for  randomly  generated  job  sets  (average  of  10  runs  per  problem  size).  The  bottom  curve  is  the 
number  of  scheduling  options  explored  by  the  BB,  i.e.:  branches  of  the  BB  algorithm.  Although  the  BB 
cuts  the  number  of  branches  explored  by  a  large  amount,  the  exponential  growth  of  the  BB  algorithm  is 
evident.  When  N=10  the  run  times  for  the  BB  can  be  in  the  hours. 


Figure  13b:  A  follow  up  on  the  previous  figure,  showing  the  worst  case  numbers  for  the  10  runs  per 
problem  size. 




5.  Constructive  Heuristics 

Priority  Dispatch 
Look  Ahead 


Constructive  heuristics  build  a  schedule  from  scratch.  They  usually  use  rules  for  selecting  jobs  and  rules 
for  allocating  resources  to  the  jobs.  They  can  also  be  used  to  "add"  to  an  existing  schedule,  usually  treating 
the  existing  commitments  as  hard  constraints.  The  first  constructive  heuristic  that  we  describe  is  called  the 
Priority  Dispatch  method,  a  method  with  many  possible  variations.  The  main  advantage  of  this  method  is 
that  it  is  very  easy  to  understand  (three  distinct  phases)  and  runs  very  fast.  The  Look  Ahead  is  similar  to 
the  Priority  Dispatch  approach  but  uses  a  much  more  "intelligent"  allocation  step.  The  Look  Ahead 
approach  tends  to  uncover  the  interactions  between  jobs  in  the  queue  and  therefore  anticipates  certain 
conflicts. 

Priority  Dispatch  (PD):  This  method  begins  with  a  batch  of  unscheduled  jobs  and  uses  three  phases: 
Selection,  Allocation,  and  Optimization,  to  construct  a  schedule.  There  are  many  possible  variations  on 
these  steps,  and  here  we  present  a  representative  version. 

Phase I: Selection. Rank the contending, unscheduled, jobs according to:

$$\mathrm{rank}(\mathrm{job}_i) = f_i, \quad \text{where: } f_i = \frac{p_i}{d_{\min,i}}$$

This, of course, is just one way to rank the jobs. Experience shows that this is a very good way to rank
the jobs (for randomly generated sets). Intuition explains this by the fact that the feature f measures the
"worth per unit length" of the job^13, so jobs with a high value of f tend to have a high value of p mixed
with a small duration. Such jobs make relatively large contributions to the overall score while conflicting
with very few other jobs (remember, we are speaking intuitively here).

Phase II: Allocation. Take the job at the top of the queue from Phase I and consider scheduling the
minimum duration ($d_{\min}$). If there is room for it within its window of opportunity (that does not
conflict with previously scheduled jobs) consider the best spot: where the area under the score profile is
largest. If there is no room for the job skip to the next job in the queue^14. By choosing to place $d_{\min}$, as
opposed to longer durations (such as $d_{\max}$), we are getting the job on the schedule while conflicting with
the fewest other jobs^15. This is an interesting balance of "greedy" and "altruistic" strategies:
place the job at its best spot but the job only gets its minimum duration. Continue through the job
queue until all jobs have had one chance to be scheduled.

^13 This is closely related to the "weighted shortest processing time (WSPT)" algorithm that is proven optimal for
specific single machine sequencing problems [14].

^14 We do not consider bumping a scheduled task here. Such a variation can be easily added, but we have found this
to be effective only when the job queue is poorly sorted (i.e.: using a bad sort key).

^15 We know from prior simulations that this favors getting more jobs on the schedule at the expense of some
worth. Getting more jobs on the schedule is not always mentioned as a criterion but it almost always is beneficial.

Phase III: Optimization. After Phases I and II are complete, a final phase runs through the scheduled jobs, this time ranked according to rank(job_i) = p_i, and grows each job to the left and right until it meets another activity, reaches its maximum allowed duration (d_max), or reaches the edge(s) of its window of opportunity. This is a form of Hill Climbing, and in fact any hill-climbing algorithm could be plugged in at this step, but the theme of the PD is to keep it simple. Estimated order of the PD algorithm: O(n log n).
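The three phases can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in this study: time is discrete, capacity is one job per time unit, the window is the half-open interval [w0, wf), and the Phase I sort key (priority / d_min) is an assumed choice. The names Job and pd_schedule are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    priority: float
    d_min: int
    d_max: int
    w0: int      # first time unit of the window (inclusive)
    wf: int      # end of the window (exclusive); an assumed convention
    score: list  # score profile, one value per time unit of the horizon

def pd_schedule(jobs, horizon):
    owner = [None] * horizon  # capacity 1: at most one job per time unit
    placed = {}               # job name -> [start, stop)
    # Phase I (selection): the sort key priority / d_min is an assumption.
    queue = sorted(jobs, key=lambda j: j.priority / j.d_min, reverse=True)
    # Phase II (allocation): place d_min where the area under the score is largest.
    for j in queue:
        best = None
        for s in range(j.w0, j.wf - j.d_min + 1):
            if all(owner[t] is None for t in range(s, s + j.d_min)):
                area = sum(j.score[s:s + j.d_min])
                if best is None or area > best[0]:
                    best = (area, s)
        if best is None:
            continue  # no room: skip to the next job (no bumping)
        start = best[1]
        for t in range(start, start + j.d_min):
            owner[t] = j.name
        placed[j.name] = [start, start + j.d_min]
    # Phase III (optimization): grow scheduled jobs, highest priority first.
    by_name = {j.name: j for j in jobs}
    for name in sorted(placed, key=lambda n: by_name[n].priority, reverse=True):
        j, span = by_name[name], placed[name]
        grew = True
        while grew and span[1] - span[0] < j.d_max:
            grew = False
            if span[0] - 1 >= j.w0 and owner[span[0] - 1] is None:
                span[0] -= 1              # grow one unit to the left
                owner[span[0]] = name
                grew = True
            if (span[1] - span[0] < j.d_max and span[1] < j.wf
                    and owner[span[1]] is None):
                owner[span[1]] = name     # grow one unit to the right
                span[1] += 1
                grew = True
    return placed
```

Because each phase is a separate loop, a different sort key or a different growth rule can be swapped in without touching the other phases, which is the modularity discussed in the text.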

The  advantages  of  the  PD  approach  are  that  it  is  nicely  modular,  the  scheduling  strategy  is  easy  to 
understand,  it  runs  very  fast,  and  produces  acceptable  schedules  most  of  the  time.  The  modularity  and 
simplicity  support  the  addition  of  many  variations  that  may  prove  useful  for  a  specific  application.  The 
accuracy  of  the  PD,  however,  can  be  very  poor  for  certain  sets  of  competing  jobs.  In  some  such  cases 
additional  rules  can  be  added  to  the  Selection  and  Allocation  phases.  If  the  desired  accuracy  still  cannot  be 
achieved,  we  recommend  going  to  the  Look  Ahead  method,  which  sacrifices  some  speed  and  simplicity  to 
get  a  significant  increase  in  accuracy. 


Figure  14a:  The  PD  approach  uses  three  phases.  The  Sort  phase  ranks  the  jobs  and  the  Allocate  phase 
places  them  on  the  schedule.  When  all  the  jobs  are  allocated  the  Optimization  phase  does  some  fine  tuning 
(e.g.:  take  advantage  of  unused  resources). 


Figure 14b: There are many variations on the PD phases. In this sketch the PD ranks the jobs and after each allocation re-ranks the jobs (their ranking features may be affected by a job that just got scheduled)^16. The allocation phase allocates only the minimum duration to the job, and the optimization phase "grows" the jobs that made the schedule into unused spaces.

16 There are cases where we re-sort the queue after each allocation (e.g., when the allocation of a job changes the rankings of the jobs remaining in the queue).

Sorting the Jobs

Some possibilities for the rank of a job:

° Rank = Priority
° Rank = Priority / d_min
° Rank = d_min
° Rank = d_max
° Rank = d_max - d_min
° Rank = w_f - w_0 - d_min
° Rank = Priority * d_max
° etc.

Figure  15:  There  are  many  ways  to  sort  the  jobs.  In  most  cases  the  parameters  of  the  job  are  used  to 
compute  a  value  that  is  used  as  the  sort  "key".  In  some  cases  the  sort  key  depends  on  the  current  "state"  of 
the  schedule,  for  example  if  rank  =  [  (time  available  for  scheduling)  -  dmin ],  then  the  value  will  change  as 
jobs  are  scheduled  that  use  some  of  the  available  time. 
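The sort keys listed in Figure 15 can be written down directly. The sketch below is illustrative only; the dictionary layout of a job and the function name rank_keys are assumptions, and the last key shows a rank that depends on the current state of the schedule.

```python
def rank_keys(job, time_available=None):
    """Return candidate sort keys for one job (field names are hypothetical)."""
    p, d_min, d_max = job["priority"], job["d_min"], job["d_max"]
    keys = {
        "priority": p,
        "priority/d_min": p / d_min,
        "d_min": d_min,
        "d_max": d_max,
        "d_max-d_min": d_max - d_min,
        "wf-w0-d_min": job["wf"] - job["w0"] - d_min,
        "priority*d_max": p * d_max,
    }
    if time_available is not None:
        # State-dependent key: changes as scheduled jobs use up available time.
        keys["time_available-d_min"] = time_available - d_min
    return keys

def sort_queue(jobs, key_name, descending=True):
    """Sort the job queue by one of the named keys."""
    return sorted(jobs, key=lambda j: rank_keys(j)[key_name], reverse=descending)
```

When a state-dependent key is used, the queue would have to be re-sorted after each allocation, as the caption notes.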


Figure  16:  The  "allocation"  step  in  the  PD  uses  a  maximize-minimize  strategy:  although  the  minimum 
duration  is  scheduled,  it  is  scheduled  at  the  best  available  time  (with  respect  to  its  score  profile). 

Look Ahead Algorithm (LA): A powerful modification can be made to the PD method at the allocation step, when the job is placed at the best spot (as determined by area under the suitability function). This can be greatly improved by determining "best" by looking ahead. To score the possible placements of a job, let the dispatch queue continue scheduling (without actually making the allocations, but obeying all the constraints) and compute the final Q. Do this for each possible placement of the job. The position that scores the highest is declared "best" and the job is allocated there, and the iteration continues (no backtracking). This accounts for interactions of the job with as yet unscheduled tasks (down-stream jobs). This adds a noticeable amount of processing time but we gain a significant amount of quality. Estimated order: O(n").
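The look-ahead step can be sketched as follows. This is a simplified illustration, not the study's code: the function dispatch is a stand-in for the PD (each remaining job gets d_min at the earliest free spot), and the quality measure Q (priority-weighted number of scheduled time units) is an assumption. All names are hypothetical.

```python
def free(owner, start, dur):
    return all(owner[t] is None for t in range(start, start + dur))

def options(job, owner, durations):
    """All legal (start, duration) placements of a job on the current schedule."""
    return [(s, d) for d in durations
            for s in range(job["w0"], job["wf"] - d + 1) if free(owner, s, d)]

def place(owner, job, start, dur):
    new = list(owner)  # copy, so hypothetical placements never alter the real schedule
    for t in range(start, start + dur):
        new[t] = job["name"]
    return new

def dispatch(queue, owner):
    """Simplified PD stand-in: each job gets d_min at the earliest free spot."""
    for job in queue:
        opts = options(job, owner, [job["d_min"]])
        if opts:
            owner = place(owner, job, *opts[0])
    return owner

def quality(owner, jobs):
    """Assumed Q: priority-weighted count of scheduled time units."""
    prio = {j["name"]: j["priority"] for j in jobs}
    return sum(prio[n] for n in owner if n is not None)

def look_ahead(queue, horizon):
    owner = [None] * horizon
    for k, job in enumerate(queue):
        best = None
        for s, d in options(job, owner, range(job["d_min"], job["d_max"] + 1)):
            # Hypothetically schedule the rest of the queue and score the result.
            trial = dispatch(queue[k + 1:], place(owner, job, s, d))
            q = quality(trial, queue)
            if best is None or q > best[0]:
                best = (q, s, d)
        if best is not None:
            owner = place(owner, job, best[1], best[2])  # commit, no backtracking
    return owner
```

In a small case where the first job would otherwise block the only window of the second job, the plain dispatcher schedules one job while the look-ahead version schedules both.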
Figure 17: The Look Ahead algorithm scores each scheduling option by hypothetically scheduling the rest of the queue (using the PD) and computing the score of the hypothetical schedule. The option that scores the highest is the option that is actually scheduled and then the system moves to the next job in the queue.


[Figure 18 diagram: Look Ahead Algorithm. For the job being placed, consider all legal options given the scheduled and unscheduled jobs. With that option in place: apply the PD to all the unscheduled jobs, and compute the score of the resulting schedule (the option "score"). After evaluating each option, schedule the job at the highest scoring option, and move on to the next job in the queue.]


Figure  18:  It  is  important  to  realize  that  the  Look  Ahead  Algorithm  considers  all  the  legal  scheduling 
options  for  each  job,  and  is  not  restricted  to  considering  the  minimum  duration  as  is  the  PD. 


The advantage of the Look Ahead algorithm is that it offers an excellent trade-off of speed and simplicity for accuracy. The algorithm is relatively fast, modular, relatively easy to understand, and achieves its accuracy by dynamically detecting down-stream conflicts. The effectiveness of this algorithm highlights the fact that the interactions (i.e., competition) between the contending jobs must be evaluated in detail to uncover the dependencies. Intuitively, the LA is measuring the sensitivity of the down-stream jobs to the particular placements of the job currently being allocated.
6. Fuzzy Logic


We treat Fuzzy Logic as an extension of the rule-based approach. For example, consider the following rule:

If Priority is high and the Laxity is low Then the Rank is high.

From a fuzzy point of view Priority, Laxity and Rank are linguistic variables that are qualified by terms such as {low, medium, high}. Membership in the qualifying group is measured by a membership function such as those shown in figure 19.


Figure 19: A typical set of fuzzy membership functions for the linguistic variable Worth.

For a given set of jobs we can compute specific values for Priority and Laxity with formulas such as:

Priority(job_i) = p_i · d_max,i

Laxity(job_i) = W_i - d_min,i

where p is the job's priority and W is the width of the job's window of opportunity (WCP). The range of values of Priority and Laxity over a given job set would be used to calibrate the membership curves (this provides the appropriate labeling of the x-axis: minimum to maximum values). With these values we can generate rankings for the jobs using standard fuzzy logic: "fuzzy and" and "fuzzy or" [13].
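The rule "If Priority is high and the Laxity is low Then the Rank is high" can be evaluated with the usual min/max operators for "fuzzy and" and "fuzzy or". The sketch below is illustrative only: the ramp-shaped membership functions are an assumed stand-in for the curves of figure 19, and the function names are hypothetical. The calibration step scales each value to [0,100] using the range observed over the job set.

```python
def scale(x, lo, hi):
    """Calibrate a crisp value to [0, 100] using the range seen over the job set."""
    return 100.0 * (x - lo) / (hi - lo) if hi > lo else 50.0

def low(x):
    """Assumed membership in 'low': 1 at 0, falling linearly to 0 at 50."""
    return max(0.0, min(1.0, (50.0 - x) / 50.0))

def high(x):
    """Assumed membership in 'high': 0 at 50, rising linearly to 1 at 100."""
    return max(0.0, min(1.0, (x - 50.0) / 50.0))

def fuzzy_and(a, b):
    return min(a, b)

def fuzzy_or(a, b):
    return max(a, b)

def rank_is_high(priority, laxity, priority_range, laxity_range):
    """Degree to which 'Priority is high and Laxity is low' holds, in [0, 1]."""
    p = scale(priority, *priority_range)
    lax = scale(laxity, *laxity_range)
    return fuzzy_and(high(p), low(lax))
```

Sorting the queue by this degree, largest first, would put high-priority, low-laxity jobs at the top.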


Rules of Thumb

Top of the Queue: High Priority, Low Laxity, Short Jobs
Bottom of the Queue: Low Priority, High Laxity, Long Jobs

Figure 20: Some basic rules of thumb that lend themselves to a fuzzy implementation.

17 We are "overloading" the term "priority". In the fuzzy sense (Priority) we think of it as a generic term for the value of the job. In the original sense "priority" is the specific priority assigned to the job when it is accepted into the job queue.

Figure 21a: By scaling the measured values of Priority, Laxity and Rank to the range [0,100], we can compute (crisp) values and fuzzify them into the range [0,1].

Figure 21b: If we introduce the linguistic terms very low, low, medium, high, and very high, we can create 25 fuzzy rules for the various values of Priority and Laxity. For example, the top left entry in the chart says: "if the Priority is very low (VL) and the Laxity is very low (VL) then the Rank is low (L)".

The chart in figure 21b expresses the strategy that the ranking of a job should be lowered if the laxity is high and raised if the laxity is low. Looking at the first column in figure 21b, we see that the laxity is very low and the priority ranges from very low to very high, while the entries in the chart indicate that the ranking is one step above the priority: when the priority is very low the ranking is low, when the priority is low the ranking is medium, etc. A similar thought is expressed by the other columns in the chart.
FUZZY ALLOCATION

If Ranking is high
and LeadTime is high
and ConstraintViolation is low
Then Allocation is high.

[Plot: Fuzzy Constraints (axis: constraint violation)]

Figure 22a: We can go further with the fuzzy concept and apply it to the "allocation" step in the dispatch approach. As the figure indicates, the constraints themselves can be modeled in a fuzzy way, allowing degrees of constraint violation depending on such things as the lead time and importance of the job being allocated.


FUZZY QUALITY

Multiple Criteria: Quality depends on Q1,...,Qn

If Worth is high
and Distribution Error is low
and #Jobs is high
Then Quality is high.


Figure  22b:  In  situations  with  multiple  criteria  for  measuring  the  quality  of  a  schedule,  we  can  apply 
fuzzy  concepts  for  combining  the  disparate  measures  of  quality. 

When we put the fuzzy ranking, fuzzy allocation, and fuzzy quality together we get what might be called "fuzzy scheduling". The idea of "fuzzy scheduling" fits nicely with the progressive refinement approach described later in this report: the constraints can become less and less fuzzy as the day of operation approaches.

The  fuzzy  approach  has  two  attractive  features: 

•  It  adds  a  "layer  of  abstraction"  that  allows  the  designer  to  discuss  the  nature  of  the  rules  in  a 
simple  way. 

•  It  has  a  built  in  tendency  to  be  insensitive  to  small  disturbances  (the  “fuzzy  and”  and  “fuzzy 
or”  operations  tend  to  screen  out  small  changes). 

The disadvantages of the fuzzy approach are:

•  The membership functions are not always easy to define and calibrate.

•  It requires some additional processing time.

We believe that the advantages will outweigh the disadvantages when fuzzy logic is applied to the PD rule structure. Intuitively, the fuzzy approach seems consistent with the fact that the objective function (e.g., a combination of the measures of all the scheduling criteria) is not as accurate as we would like it to be. It tends to have subjective weightings of parameters, which is more like adding apples and oranges than a precise mathematical representation. Add in the fact that the criteria can be vague and can be rapidly changing, and it appears that a fuzzy approach would help filter out meaningless variations. We are currently investigating this approach.

Finally, reference [6] introduces the "θ-projection". This method serves two purposes: (1) it is an initial version of a fuzzy logic approach to the PD; and (2) it emphasizes the dynamic aspect of the PD: the queue can be sorted after each scheduling decision without a significant amount of additional processing time.


Figure 23: Consider the case of two "features", f1 and f2, used to rank the jobs (e.g., worth and laxity). After the features are normalized to the [0,100] range each job is represented by a point in the square in the figure. If the points are projected onto the various lines (0°, 45°, etc.) we would be sorting the jobs according to a "high" or "low" measure of the features. For example, projecting onto the 0° line is the same as ranking the jobs exclusively according to f1, projecting onto 90° sorts them by f2, projecting them onto 45° sorts them by 1/2·f1 + 1/2·f2, etc. Also notice that if we project onto 270° then we are sorting according to f2 but in ascending order. This is the basis of the θ-filter method described in [6].

Additionally, the points in the square in figure 23 would be moving if we were re-ranking the jobs after each allocation step. For example, if we were ranking the jobs so that a high f1 and high f2 were at the top of the queue (i.e., θ = 45°) then the unscheduled jobs would be migrating to the upper left of the square.
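One way to read the projection in figure 23 is as a dot product of the feature point (f1, f2) with the unit vector at angle θ. The formula below is that reading, offered as an assumption since the text does not state it explicitly; the function names are hypothetical.

```python
import math

def theta_rank(f1, f2, theta_deg):
    """Projection of the point (f1, f2) onto the line at angle theta (assumed formula)."""
    t = math.radians(theta_deg)
    return f1 * math.cos(t) + f2 * math.sin(t)

def sort_jobs(jobs, theta_deg):
    """jobs is a list of (name, f1, f2); largest projection goes to the top of the queue."""
    return sorted(jobs, key=lambda j: theta_rank(j[1], j[2], theta_deg), reverse=True)
```

At 45° the projection is (f1 + f2)/√2, which orders the jobs exactly as 1/2·f1 + 1/2·f2 does, and at 270° it is -f2, which gives the ascending f2 order. Changing the sort strategy is then a matter of changing one angle.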
7. Improvement Heuristics

Hill Climbing
Simulated Annealing
Tabu Search


There are many algorithms that would be classified as "improvement"-based: begin with a schedule and successively improve it. One way to improve the schedule might be to strip everything off the schedule and then rebuild it with a constructive heuristic, so it is clear that the distinction between "improvement" and "constructive" can be vague at times, and to some extent they represent two ends of a full spectrum of methods. For example, the PD algorithm uses a mix of improvement and constructive techniques: the first two phases are constructive, while the third phase is improvement-based (the "optimization" phase). The optimization phase of the PD is, in fact, a form of Hill Climbing.

Hill Climbing (HC): Hill climbing algorithms try to improve the quality of a given schedule by a small change in one or several of the variables (i.e., moves). Such an algorithm could be used in the optimization phase of the PD algorithm. The variables in this case are the start and stop times for each of the jobs on the schedule. For the WCP we know that there may be jobs that are not scheduled, so we could consider bumping jobs off the schedule and placing new ones on, but for this analysis we will fix the jobs on the schedule. Nor do we consider swapping the order of any jobs. Thus, for now we only consider sliding, shrinking, and growing the jobs by one time unit (within the window, duration, and capacity constraints).


Figure 24: Some of the "moves" that might be applied to a job that is on the schedule. Notice that the moves must obey the window and duration constraints. The Slide and Grow moves might conflict with a neighboring job on the schedule, in which case the decision must be made to either not allow the move or accommodate it by modifying the neighbor (e.g., slide, shrink or bump the neighbor).

Figure 25: Hill Climbing is complicated by the fact that an incremental change in one parameter may cause a ripple effect as it affects other jobs on the schedule. For example, growing a job to the right might conflict with a neighboring job, as shown in the figure (job i bumps into job j).

A change in a variable induces a change in Q (the measure of schedule quality). In theory we would consider all possible changes (one time unit) and take the best one:

max_v { ΔQ/Δv }


Figure 26: Hill Climbing considers all possible incremental changes in the state of the schedule and then picks the one that increases Q by the most (largest ΔQ). If all ΔQ's are negative the algorithm halts.


After making a change the algorithm iterates. The algorithm halts when no Δv produces a ΔQ > 0. In practice this means computing all these possibilities at each iteration, consuming a significant amount of processing time. A common compromise is to rank the jobs (most likely to improve Q the most at the top) and compute only the variable changes for one job at a time, take the best of those, and move to the next job, etc.


Keep in mind that a move could cause a constraint violation. Such a violation can come from any of the following:

•  window constraints: each job_i has a window of opportunity [w_0i, w_fi], and we cannot schedule job_i outside this window.

•  duration constraints: each job_i has a minimum and maximum duration [d_min,i, d_max,i], and the scheduled duration (d_i) must satisfy d_min,i ≤ d_i ≤ d_max,i.

•  capacity constraints: each time unit can have only one job assigned to it (i.e., resource capacity = 1).

Moves that violate the window and duration constraints are considered illegal and are not tolerated. The three algorithms that we describe below are distinguished by how they handle a potential capacity violation. For example, if job_i has an immediate neighbor on the right, job_j (i.e., no time gap between the end of job_i and the beginning of job_j), then growing job_i one time unit to the right will cause a conflict (i.e., job_i collides with job_j). We have a few options at this point: consider the move illegal, or try to move job_j to make room for the move. The simplest algorithm is to freeze all the other jobs when we try to move a given job, that is, make no attempt to adjust neighbors to accommodate moves. Below, we consider three algorithms, beginning with the simplest.

Simple Hill Climbing Algorithm: For all jobs on the schedule consider all the possible legal moves (1 time unit). A move that violates any of the constraints (window, duration, capacity) is considered illegal. Take the legal move that increases Q the most, then iterate. The algorithm halts when there is no legal move with ΔQ > 0. Thus, there are 6 moves to consider for each job_i:

1.  grow_R(i)

2.  grow_L(i)

3.  slide_R(i)

4.  slide_L(i)

5.  shrink_R(i)

6.  shrink_L(i).
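The simple algorithm above can be sketched as follows. This is a minimal illustration under stated assumptions: each scheduled job is a half-open interval (start, stop), the job parameters live in a dictionary with hypothetical field names, and the quality measure Q is supplied by the caller because its exact form depends on the application.

```python
# The six one-unit moves, as (change in start, change in stop).
MOVES = {"grow_R": (0, 1), "grow_L": (-1, 0), "slide_R": (1, 1),
         "slide_L": (-1, -1), "shrink_R": (0, -1), "shrink_L": (1, 0)}

def legal(name, span, sched, jobs):
    """Check window, duration, and capacity constraints for one proposed span."""
    start, stop = span
    job = jobs[name]
    if start < job["w0"] or stop > job["wf"]:
        return False                      # window violation
    if not job["d_min"] <= stop - start <= job["d_max"]:
        return False                      # duration violation
    # Capacity violation: overlap with any other (frozen) job.
    return all(stop <= s2 or e2 <= start
               for other, (s2, e2) in sched.items() if other != name)

def hill_climb(sched, jobs, quality):
    sched = dict(sched)
    while True:
        q0 = quality(sched)
        best = None
        for name, (start, stop) in sched.items():
            for d_start, d_stop in MOVES.values():
                span = (start + d_start, stop + d_stop)
                if legal(name, span, sched, jobs):
                    dq = quality({**sched, name: span}) - q0
                    if dq > 0 and (best is None or dq > best[0]):
                        best = (dq, name, span)
        if best is None:
            return sched                  # no legal move with positive gain: halt
        sched[best[1]] = best[2]
```

Every accepted move strictly increases Q, so the loop cannot revisit a schedule and must halt, although only at a local optimum.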

Local Pressure Algorithm: This algorithm is the same as the "simple algorithm" except that we consider making room for a move if a collision occurs. If job_i collides with job_j then we consider sliding and/or shrinking job_j to accommodate the move. Thus, there are 14 possible moves: the first 6 apply when there is no collision and the last 8 apply when job_i collides with job_j:

1.  grow_R(i)

2.  grow_L(i)

3.  slide_R(i)

4.  slide_L(i)

5.  shrink_R(i)

6.  shrink_L(i)

7.  grow_R(i) & slide_R(j)

8.  grow_R(i) & shrink_L(j)

9.  grow_L(i) & slide_L(j)

10.  grow_L(i) & shrink_R(j)

11.  slide_R(i) & slide_R(j)

12.  slide_R(i) & shrink_L(j)

13.  slide_L(i) & slide_L(j)

14.  slide_L(i) & shrink_R(j).

If it is not possible to move job_j without causing another constraint violation then the move is not legal. We could say that the local pressure algorithm only goes "one deep" since it only considers moving an immediate neighbor. We could easily extend the algorithm to consider going "two deep" by recursively applying the same moves to the job that job_j collides with, etc. For the local pressure algorithm we limit the ripple to the immediate neighbors for now.

Neighborhood Search (Radius = 1): For each job_i, consider its immediate neighbors. An immediate neighbor is a job that has no time gap between it and the start or stop of job_i. There can be either 2, 1 or 0 immediate neighbors. If there are 0 immediate neighbors we consider the 6 simple moves, as above.

If there are 2 immediate neighbors, then there is one on the left (job_L) and one on the right (job_R). Define a "window" of time for rescheduling these three jobs: job_L, job_i, job_R. We freeze all other jobs on the schedule outside this window. The left limit of the window is the maximum of {start of job
Figure 27: For job_i we consider a neighborhood of radius 1. In the sketch job_i has immediate neighbors on both sides. The rescheduling neighborhood is defined to be 1 time unit to the left and right of the neighbors respectively. If there is no gap to the left of job_L then the Left Limit is taken to be the start of job_L, and if there is no gap to the right of job_R the Right Limit is taken to be the stop of job_R. All jobs that are scheduled outside this neighborhood are considered frozen (i.e., cannot be adjusted in any way).


Figure 28: Within the rescheduling window we consider only moves of 1 time unit. That is, the start and stop of a job can only be: incremented ("+" means "add one"); left the same ("0"); or decremented ("-" means "subtract one"). All such adjustments must satisfy the constraints (window, duration, capacity). We can then view the problem as a "labeling" problem: label the start and stop of each of the 3 jobs with a "+", "0" or "-" and find the labeling that produces the highest ΔQ. Notice that since the stop of job_L is initially equal to the start of job_i, there are only 6 ways to label them without causing a capacity violation. For example, if the stop of job_L is labeled "+" then the only way to label the start of job_i is "+". The maximum number of possible labelings is 3^2 · 6^(k-1), where k = 2·Radius + 1 and Radius = 1. Thus, the maximum number of labelings when Radius = 1 is 324. Many of these labelings will be illegal because of window and duration constraints, and possibly because the Left and Right Limits do not allow "-" and "+" respectively.
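The count of 324 can be checked by brute force. The sketch below enumerates every labeling of the start and stop times of k abutting jobs and keeps those that cause no capacity violation, that is, those where the stop of each job does not move past the start of its right-hand neighbor. Window and duration constraints are deliberately ignored, so this gives only the upper bound; the function name is hypothetical.

```python
from itertools import product

LABELS = (-1, 0, 1)  # "-", "0", "+": the change applied to a start or stop time

def count_labelings(k):
    """Count capacity-feasible labelings for k jobs that abut one another."""
    count = 0
    for labels in product(LABELS, repeat=2 * k):
        # labels = (start_1, stop_1, start_2, stop_2, ..., start_k, stop_k)
        feasible = all(labels[2 * i + 1] <= labels[2 * i + 2] for i in range(k - 1))
        if feasible:
            count += 1
    return count
```

For k = 3 the free start of job_L and the free stop of job_R contribute 3 choices each, and each of the two shared boundaries contributes 6 choices, giving 3 · 6 · 6 · 3 = 324.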
Figure 29: As we label the start and stop times, from left to right, the constraints propagate. For example, the start of job_L is constrained by the Left Limit and its left window limit (w_0). When the start of job_L is labeled then the duration constraints and right window constraint for job_L constrain the labels for the stop of job_L. Once the stop of job_L is labeled then capacity constrains the label for the start of job_i (and so does the left window limit for job_i), etc.


We do the same thing if job_i has 0 or 1 immediate neighbors. For example, if job_i has 0 immediate neighbors then we consider all the ways to move the start and stop by one time unit. This includes all the 6 moves described in the "simple" algorithm, as well as some different ones (e.g., simultaneously shrinking on both the left and right), for a total of 9 possible moves.


Also notice that if job_i and job_j are immediate neighbors of each other, and have no other immediate neighbors, then the calculation that we do when we come to job_i need not be repeated when we come to job_j.


consider  all  the  (small)  adjustments  to  schedule  parameters  that  might  improve  the  schedule  and  choose  the 
one  that  improves  the  schedule  the  most,  and  iterate. 


[Figure 30 diagram: moves applied to a job: Slide Right (Sl_R), Grow Right (Gr_R), Shrink Right (Sh_R).]

Figure 30: Some of the "moves" that might be applied to a job that is on the schedule. Notice that the moves must obey the window and duration constraints. The Slide and Grow moves might conflict with a neighboring job on the schedule, in which case the decision must be made to either not allow the move or accommodate it by modifying the neighbor (e.g., slide, shrink or bump the neighbor).

Figure 31: Hill Climbing is complicated by the fact that an incremental change in one parameter may cause a ripple effect as it affects other jobs on the schedule. For example, growing a job to the right might conflict with a neighboring job, as shown in the figure (job i bumps into job j). Our algorithms limit the amount of ripple allowed: if we cannot accommodate the move with a simple adjustment of the neighbor (slide or shrink the neighbor, but do not bump the neighbor off the schedule) we disallow the move.


Figure 32: Hill Climbing considers all possible incremental changes in the state of the schedule and then picks the one that increases Q by the most (largest ΔQ). If all ΔQ's are negative the algorithm halts.

We have explored a variety of hill climbing strategies (e.g., bump vs. no-bump). The simplest version is to grow the jobs on the schedule (highest priority first) within the constraints (window and duration limits) and without moving any neighbors. This is the same as the "optimization" phase of the PD algorithm.

We have come to view the Hill Climbing algorithms as falling into one of two classes: simple enough that they are thought of as variations on the third phase of the PD algorithm, or sufficiently complicated that they form the backbone of both the Simulated Annealing and Tabu Search approaches. That is, the complications introduced by a sophisticated hill climber^18 will increase the accuracy of the schedules, but at the expense of run time and complexity. We have not found this accuracy-speed-simplicity trade-off to be worth it: the relatively small increase in accuracy comes at a high price. In fact, the Simulated Annealing and Tabu Search methods fall into the same category. Although they can provide significant improvements in accuracy, they also introduce significant increases in run time (several orders of magnitude longer than the PD) and complexity.

Simulated  Annealing:  This  method  begins  the  same  way  as  the  Hill  Climber  but  adds  a  random 
component  that  makes  random  decisions  in  the  beginning,  and  gradually  less  random  decisions  as  it 
progresses,  and  finally  converges  to  a  deterministic  Hill  Climbing  method. 

Consider the moves that were possible for the Hill Climber, and add moves that might have been ignored because they were certain to produce a negative ΔQ. At any state of the schedule we can randomly pick one of the moves and consider the ΔQ associated with it. The following rule is applied^19: map the value of ΔQ through the sigmoidal function shown in the diagram below; this produces the value f(ΔQ), which is some number between 0 and 1. Then we compare that number to a randomly generated number between 0 and 1 and use the rule: if the random number is greater than f(ΔQ) then do not make the move, and iterate; if the random number is less than or equal to f(ΔQ) then make the move and iterate.

18 For example, we might allow swapping position with the neighbor, and other such "local" moves.


Figure  33:  The  "accept/reject"  method  used  in  Simulated  Annealing  depends  on  the  shape  of  the  sigmoid, 
which  in  turn  is  controlled  by  the  value  of  the  "temperature"  T. 

Notice that the steeper the curve the more likely we are to accept moves with positive ΔQ and to reject moves with negative ΔQ (i.e., similar to the Hill Climber). Conversely, the flatter the curve the more likely we are to be randomly accepting and rejecting moves (independent of the value of ΔQ). The parameter "T" is called the "temperature": when the temperature is high the moves are random, and when the temperature is low the algorithm behaves like the Hill Climber. Thus, SA requires a "temperature schedule" that usually begins with a high value of T and successively lowers it until it is small enough that Hill Climbing takes over, and then halts when no improvements can be made to the schedule. Along the way, many schedules are "observed", so the method implies that we keep track of the best schedule that we have seen through the whole process.
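The accept/reject rule and the temperature schedule can be sketched as follows. The text does not give the formula for the sigmoid, so the logistic form f(ΔQ) = 1/(1 + exp(-ΔQ/T)) used here is an assumption, as are the geometric cooling schedule and its default parameters. The final deterministic hill-climbing pass is omitted for brevity, and all names are hypothetical.

```python
import math
import random

def accept_probability(dq, temperature):
    """Assumed logistic sigmoid f(dQ): steep for small T, flat for large T."""
    x = max(-700.0, min(700.0, dq / temperature))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))

def anneal(state, neighbors, quality, t_start=10.0, t_end=0.01, cooling=0.9, rng=None):
    """Return the best state seen and its quality (geometric cooling is assumed)."""
    rng = rng or random.Random(0)
    best, best_q = state, quality(state)
    t = t_start
    while t > t_end:
        candidate = rng.choice(neighbors(state))   # pick one move at random
        dq = quality(candidate) - quality(state)
        if rng.random() <= accept_probability(dq, t):
            state = candidate                      # make the move
            if quality(state) > best_q:
                best, best_q = state, quality(state)  # remember the best schedule seen
        t *= cooling                               # lower the temperature
    return best, best_q
```

With a very large T the acceptance probability is close to 1/2 regardless of ΔQ, and with a very small T it approaches 1 for improving moves and 0 for worsening ones, which matches the behavior described above.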

The advantage of the SA approach is that it avoids getting stuck at local optima and can produce some of the best results for a wide range of NP-Complete problems^20. The drawback is that it introduces randomness into the schedule-making process, so it may not be repeatable and it is difficult to understand "how" the result was arrived at (in layman's terms). The major drawback, however, is that the SA results tend to come at the expense of significant increases in run time (we have ruled this method out for this reason).

It  is  worth  mentioning  at  this  point  that  an  "agent-based"  approach  could  use  the  same  "moves"  while 
viewing  each  job  as  an  "agent"  that  "looks"  at  its  environment  (competing  jobs,  neighbors)  and  makes  a 
"move".  In  this  view  the  "jobs"  are  interacting  like  a  community  of  agents,  observing  their  environment, 
communicating,  negotiating,  etc.  We  have  only  explored  limited  versions  of  such  an  approach,  and  in  fact, 
ended  up  implementing  simulated  annealing  in  its  place. 

Tabu Search: We did not simulate a Tabu Search approach, and although we plan to test such a method in the future, we extrapolated from our experience with Hill Climbing, Simulated Annealing, and Genetic Algorithms to come to the conclusion that it will introduce unacceptable increases in run time and unacceptable complexity. Tabu Search does, however, show promise in that it minimizes one of the major weaknesses of the SA approach (it removes the randomness of the decisions), while exploiting its strength (visit local optima, but do not get stuck there). Our fear is that it will lead to a backtracking search, however controlled and intelligent it might be. It is our experience that backtracking must be introduced in extremely limited ways or the run times quickly become unacceptable. This intuition, combined with the simplicity issue, points us away from the Tabu Search method. But in all fairness this is an open question.

19 In some SA applications the following rule is applied: if ΔQ > 0 then make the move (no randomness) and iterate.

20 For many problems where the optimal solution is unknown, the "best known" results are almost always obtained with an SA approach, and they tend to be used for comparisons.
DEVELOPMENT OF A NEW NUMERICAL
BOUNDARY CONDITION FOR PERFECT CONDUCTORS


Jeffrey L. Young
University of Idaho


Abstract 


A  new  numerical  boundary  condition  is  derived  that  extends  Maxwell’s  equations  to  the  surface  of  a  perfect 
conductor.  This  condition  explicitly  shows  the  interrelationship  between  surface  charge,  surface  current 
and  tangential  electric  field.  Particularly,  the  boundary  equations  are  similar  to  Euler’s  linearized  inviscid 
acoustic  equations,  where  current  density  is  akin  to  fluid  velocity  and  charge  density  is  akin  to  mass  density; 
the  normal  derivative  of  the  tangential  electric  field  is  the  source  term  in  the  equivalent  momentum  equation. 

The two sets of equations, Maxwell’s and Euler’s, are discretized using the finite-volume MUSCL procedure
along with the two-stage Runge-Kutta integrator. Specifically, the spatial discretization employs the
flux-splitting procedure along with windward-biased differencing. Such an approach captures the solution within
its  domain  of  influence,  as  required  from  a  solution  of  a  set  of  hyperbolic  equations.  Formally,  the  resulting 
scheme  is  third-order  accurate  in  space  and  second-order  accurate  in  time. 

Numerical  results  are  provided  to  validate  the  proposed  methodology  by  considering  plane  wave  scattering 
from  a  perfectly  conducting  sphere.  Such  an  example  provides  all  the  necessary  complexity  to  exercise  the 
algorithm.  Numerical  data  are  provided  that  show  the  progression  of  charge  density  on  the  sphere  and 
the  resulting  radar  cross-section  (RCS)  produced  by  the  sphere.  Data  comparisons  are  made  with  another 
method  that  employs  a  first-order  type  boundary  condition.  For  the  RCS  data,  the  theoretical  Mie  series  is 
used  for  benchmarking  purposes. 

Although  the  data  shows  that  the  scheme  works,  there  are  problems  with  the  scheme  in  terms  of  late-time 
instability.  The  causes  of  this  instability  are  currently  under  investigation  and  are  the  subject  of  a  future 
study. 


1  Introduction 

Over  the  past  decades,  several  time-domain  methods  for  Maxwell’s  equations  have  appeared  in  the  literature. 
The  most  popular  is  the  finite-difference,  time-domain  (FDTD)  method  of  Yee  [1],[2].  The  power  of  this 
method  rests  on  its  algorithmic  simplicity  and  its  ability  to  discretize  Maxwell’s  equations  to  second-order. 
Although  the  method  has  been  used  in  many  diverse  electromagnetic  applications,  its  primary  shortcoming 
is  its  inability  to  form  a  grid  that  naturally  conforms  to  the  body  under  investigation. 

To  circumvent  this  problem,  Madsen  [3]  developed  a  modified  body-fitting  FDTD  method  that  preserves 
the  divergence  properties  of  the  electromagnetic  vectors.  Despite  its  use  of  the  dual  grid  and  its  problem  with 
late  time  instabilities,  it  has  been  shown  to  produce  accurate  results  in  several  applications.  Edge  elements 
have  recently  replaced  the  traditional  nodal-based  elements  as  the  elements  of  choice  in  the  weighted-residual 
paradigm  (see  Mahadevan  and  Mittra  [4]  for  a  review  of  the  edge-element  methodology).  These  too  admit 
no  spurious  numerical  charge  since  the  element  is  customized  to  satisfy  the  divergence  equations  of  Maxwell. 
As  with  the  method  of  Madsen  and  Yee,  the  unknowns  in  the  edge-element  algorithm  are  not  collocated. 
From  a  grid  generation  and  a  coding  perspective,  this  uncollocated  strategy  adds  significant  complexity  to 
the  problem. 

More  recently,  the  finite-volume  approach,  which  was  developed  for  the  fluid  dynamics  community,  has 
been considered for electromagnetic-type problems. For this situation, Maxwell’s equations are couched in
strong  conservative  form  and  the  unknowns  are  collocated  at  the  cell  centers.  Flux  evaluations  at  the  cell 
walls  are  determined  from  an  interpolation  of  the  dependent  variables  at  cell  centers.  For  certain  types 
of  interpolative  schemes  and  flux-splitting  procedures,  the  algorithm  mimics  many  of  the  features  of  the 
hyperbolic  system  of  equations.  For  example,  when  the  fluxes  are  split  according  to  the  eigenvalues  of  the 
flux  Jacobian  matrix  and  windward-based  differencing  is  used  in  the  discretization,  the  algorithm  naturally 
captures  the  right-  and  left-  running  waves  within  their  domain  of  influence. 

The main contributors to the finite-volume strategy, as applied to Maxwell’s equations, are Shankar of
Rockwell [5, 6, 7] and Shang of Wright-Patterson Air Force Base [8, 9, 10]. The method of Shankar
is founded on the upwind Lax-Wendroff method. For example, when applied to the model scalar wave
equation, the solution u at time n + 1 and node j is derived from:

$$u_j^{n+1} = u_j^n + 0.5\,\nu\,(3u_j^n - 4u_{j-1}^n + u_{j-2}^n) + [\ldots]\,(\ldots - 2u_j^n + u_{j-1}^n),$$

where $\nu = c\,\delta t/\delta x$. When applied to higher dimensions, Shankar decomposes the dependent variable in terms
of  two  states,  from  which  he  derives  the  flux  at  the  cell  interface.  Shang  also  decomposes  the  dependent 
variable  in  terms  of  two  states,  but  the  flux  evaluations  are  obtained  from  a  flux-splitting  procedure  and 
the modified MUSCL algorithm [11] (sometimes referred to as the κ-scheme). By suitably selecting the
appropriate values for κ and the limiter φ, the algorithm can take on one of five different spatial discretization
schemes:  first-order  windward,  second-order  windward,  second-order  Fromm,  third-order  windward  biased 
or  second-order  central  differenced. 

Unfortunately,  there  exist  inherent  difficulties  in  maintaining  high-order  accuracy  when  these  methods  are 
applied  near  and  on  perfect  conductors.  Two  fundamental  problems  prevent  one  from  establishing  anything 
more  than  a  first-order  approximation  to  the  boundary  conditions.  First,  all  field  components  (magnetic 
and  electric)  share  the  same  point  in  space.  If  that  point  rests  on  the  perfect  conductor,  then  only  the  values 
of  three  of  the  six  components  are  known  a  priori  on  the  conductor;  these  values  are  tangential  electric 
field  and  normal  magnetic  field.  The  other  three,  normal  electric  field  and  tangential  magnetic  field  are  not 
known  since  the  former  is  proportional  to  the  induced  surface  charge  and  the  latter  is  proportional  to  the 
induced surface current density. To deduce the values for the charge and the current, one possible approach
is to extrapolate the normal electric field and tangential magnetic field from interior values. Second, when
second-order  (or  greater)  windward  differencing  is  used,  at  least  two  cell  values  on  either  side  of  the  flux 
wall  are  required  to  reconstruct  the  flux  on  the  wall.  For  cell  walls  one  cell  away  from  the  conductor,  one  of 
these  cell  values  lies  in  the  interior  of  the  conductor,  which  is  technically  outside  the  computational  domain. 
This  being  the  case,  a  first-order  approximation  is  required. 

In  this  report,  we  present  a  new  strategy  that  is  locally  and  globally  third-order  accurate  in  space. 
This  is  achieved  by  reconsidering  and  manipulating  Maxwell’s  equations  in  the  vicinity  of  the  conductor. 
Particularly,  the  result  of  this  analysis  is  a  set  of  two  time-domain  equations  that  explicitly  show  the 
interrelationship between surface charge density, surface current density and tangential electric field. As
might  be  expected,  the  interrelationship  is  quite  similar  to  Euler’s  linearized,  inviscid  acoustic  equations, 
with  the  tangential  field  acting  like  a  source  of  “current  momentum.” 

The numerical procedure rests upon the simultaneous discretization of both Maxwell’s and Euler’s equations.
In this report the discretization will be accomplished via the MUSCL procedure in generalized curvilinear
coordinates. To advance the equations in time, the second-order Runge-Kutta (RK) scheme is employed.
For  sake  of  completeness,  the  scheme  is  numerically  validated  by  considering  the  scattering  of  a  TEM  plane 
wave  from  a  perfectly  conducting  sphere.  To  this  end,  plots  of  the  sphere’s  radar  cross-section  (RCS)  and 
the  surface  charge  are  provided. 

2  Governing  Equations 

Consider  Maxwell’s  curl  equations  for  linear,  isotropic,  homogeneous  media: 

$$\mu \frac{\partial \mathbf{H}}{\partial t} = -\nabla \times \mathbf{E} \tag{1}$$

$$\epsilon \frac{\partial \mathbf{E}}{\partial t} = \nabla \times \mathbf{H}, \tag{2}$$

where E and H are the electric and magnetic fields, respectively; ε and μ are the permittivity and permeability,
respectively. Let n be a constant unit vector normal to and pointing out from the perfect conductor.
Then, by taking the scalar product of n with Ampere’s law, we obtain

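$$\epsilon\, \mathbf{n} \cdot \frac{\partial \mathbf{E}}{\partial t} = \nabla \cdot (\mathbf{H} \times \mathbf{n}) + \mathbf{H} \cdot (\nabla \times \mathbf{n}), \tag{3}$$

where it is understood that the above equation is being considered at an infinitesimal distance from the
perfect conductor. Since the curl of a constant vector is zero, $\mathbf{n} \times \mathbf{H}$ is the surface current $\mathbf{J}_s$, and $\epsilon\,\mathbf{n} \cdot \mathbf{E}$ is
the surface charge $\rho_s$, Eqn. (3) is equivalent to

$$\frac{\partial \rho_s}{\partial t} + \nabla \cdot \mathbf{J}_s = 0. \tag{4}$$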

As  expected,  the  resultant  equation  is  simply  a  statement  that  charge  must  be  conserved  on  a  perfect 
conductor.  By  way  of  comparison,  we  observe  the  analogous  relationship  between  the  acoustic  quantities  of 
fluid  velocity  and  mass  density  with  the  electromagnetic  quantities  of  current  density  and  charge  density, 
respectively. 
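
As a reading aid (this step is not spelled out in the original text), the passage from Ampere’s law to Eqn. (4) appears to rest on the standard identity for the divergence of a cross product, applied with a constant normal n:

$$\nabla \cdot (\mathbf{H} \times \mathbf{n}) = \mathbf{n} \cdot (\nabla \times \mathbf{H}) - \mathbf{H} \cdot (\nabla \times \mathbf{n})
\;\;\Rightarrow\;\;
\epsilon\, \mathbf{n} \cdot \frac{\partial \mathbf{E}}{\partial t} = \mathbf{n} \cdot (\nabla \times \mathbf{H}) = -\nabla \cdot (\mathbf{n} \times \mathbf{H}) = -\nabla \cdot \mathbf{J}_s .$$

With $\rho_s = \epsilon\, \mathbf{n} \cdot \mathbf{E}$ on the left-hand side, this is the charge conservation statement of Eqn. (4).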

The  previous  derivation  is  duplicated  by  forming  the  vector  product  of  n  with  Faraday’s  law: 

$$\mu\, \mathbf{n} \times \frac{\partial \mathbf{H}}{\partial t} = -\nabla(\mathbf{n} \cdot \mathbf{E}) + (\mathbf{n} \cdot \nabla)\mathbf{E} + (\mathbf{E} \cdot \nabla)\mathbf{n} + \mathbf{E} \times (\nabla \times \mathbf{n}). \tag{5}$$

Of course, the last two terms on the right-hand side are zero due to the requirement that n be a constant.
More importantly, the term on the left-hand side is nothing more than the rate of change of
surface current (scaled by μ) and the first term on the right-hand side is proportional to the gradient of charge density.
Thus, we are left with

$$\frac{\partial \mathbf{J}_s}{\partial t} + v^2 \nabla \rho_s = (\mathbf{n} \cdot \nabla)\mathbf{E}/\mu, \tag{6}$$

where $v = 1/\sqrt{\mu\epsilon}$ is the phase velocity of the homogeneous medium. The vector senses of $\mathbf{J}_s$ and $\nabla \rho_s$ are
tangentially directed to the conductor’s surface, which implies that the source term must also be directed
similarly. Hence,

$$\frac{\partial \mathbf{J}_s}{\partial t} + v^2 \nabla \rho_s = \frac{1}{\mu} \frac{\partial \mathbf{E}_t}{\partial n}, \tag{7}$$

where $\mathbf{E}_t$ is the tangential component of E. Here we see the analogous momentum conservation equation
whose source term is the normal derivative of tangential E.

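To summarize, the governing equations for the domain variables E and H are stated in Maxwell’s equations,
(1) and (2). The governing equations for the boundary variables $\mathbf{J}_s$ and $\rho_s$ are stated in Euler’s
equations, (4) and (7). Finally, the system is made complete via the boundary conditions:

$$\mathbf{n} \times \mathbf{E} = 0 \tag{8}$$

$$\mathbf{n} \cdot \mathbf{E} = \rho_s/\epsilon \tag{9}$$

$$\mathbf{n} \times \mathbf{H} = \mathbf{J}_s \tag{10}$$

$$\mathbf{n} \cdot \mathbf{H} = 0. \tag{11}$$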

These systems of equations (Maxwell, Euler and boundary conditions) allow one to compute both boundary
variables and domain variables simultaneously. Consequently, a numerical procedure can be developed
to utilize that fact, thus circumventing the problem of not knowing the surface current and charge densities
on the conductor a priori. Two other comments are in order. First, it is well known that tangential current
leads to electromagnetic radiation. By treating $\mathbf{J}_s$ as a dependent variable to be computed, we see that
the derived Euler’s equations have a primary importance of computing $\mathbf{J}_s$ and that Maxwell’s equations are
needed  only  for  the  deduction  of  the  source  field  in  the  analogous  momentum  equation.  Second,  the  source 
term  in  the  momentum  equation  is  easily  discretized  by  noting  that  the  node  that  resides  on  the  conductor 
is  zero,  since  tangential  E  must  vanish  on  the  conductor. 
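
The second comment can be illustrated with a short sketch. It assumes (this layout is not stated in the report) that tangential E is sampled at distances h and 2h from the wall along the normal, and it uses the standard three-point, second-order one-sided difference with the wall value set to zero. The function names are hypothetical.

```python
def wall_normal_derivative(e1, e2, h):
    """One-sided, second-order estimate of dE_t/dn at the conductor.

    e1, e2: tangential E at distances h and 2h from the wall (assumed layout).
    The wall value is zero because tangential E vanishes on the conductor.
    """
    e0 = 0.0
    return (-3.0 * e0 + 4.0 * e1 - e2) / (2.0 * h)


def momentum_source(e1, e2, h, mu=1.0):
    """Source term (1/mu) dE_t/dn of the momentum-like equation (7)."""
    return wall_normal_derivative(e1, e2, h) / mu
```

Because the stencil is exact for quadratics, a profile such as E_t(n) = 2n + 3n² returns the wall slope 2 for any h.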
2.1 Conservative Form

We begin this treatment by couching Maxwell’s equations in the following conservative form:

$$\frac{\partial U}{\partial t} + \nabla \cdot (F(U)) = 0, \tag{12}$$

where U is the solution vector and F is the point-flux tensor, which is dependent upon U. In expanded
form,

$$\frac{\partial U}{\partial t} + \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z} = 0, \tag{13}$$

where $F_x, F_y, F_z$ are the flux components of F. Since the fluxes are homogeneous functions of degree
one, we may write $F_x = AU$, $F_y = BU$ and $F_z = CU$. Here A, B, C are the Jacobian matrices associated
with $F_x, F_y, F_z$, respectively. Thus, Equation (13) is equivalent to

$$\frac{\partial U}{\partial t} + \frac{\partial (AU)}{\partial x} + \frac{\partial (BU)}{\partial y} + \frac{\partial (CU)}{\partial z} = 0. \tag{14}$$

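For homogeneous, isotropic media, it is a simple exercise to show that the flux Jacobian matrices associated
with Maxwell’s equations are given by

$$A = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & -1/\epsilon \\
0 & 0 & 0 & 0 & 1/\epsilon & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1/\mu & 0 & 0 & 0 \\
0 & -1/\mu & 0 & 0 & 0 & 0
\end{bmatrix}, \tag{15}$$

$$B = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 1/\epsilon \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -1/\epsilon & 0 & 0 \\
0 & 0 & -1/\mu & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
1/\mu & 0 & 0 & 0 & 0 & 0
\end{bmatrix}, \tag{16}$$

$$C = \begin{bmatrix}
0 & 0 & 0 & 0 & -1/\epsilon & 0 \\
0 & 0 & 0 & 1/\epsilon & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1/\mu & 0 & 0 & 0 & 0 \\
-1/\mu & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}, \tag{17}$$

where

$$U = [B_x, B_y, B_z, D_x, D_y, D_z]^T. \tag{18}$$

Hence,

$$F_x = [0, -D_z/\epsilon, D_y/\epsilon, 0, B_z/\mu, -B_y/\mu]^T \tag{19}$$

$$F_y = [D_z/\epsilon, 0, -D_x/\epsilon, -B_z/\mu, 0, B_x/\mu]^T \tag{20}$$

$$F_z = [-D_y/\epsilon, D_x/\epsilon, 0, B_y/\mu, -B_x/\mu, 0]^T. \tag{21}$$

Note: From the constitutive relationships, B = μH and D = εE.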
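
The relationship between the Jacobians and the fluxes for Maxwell’s equations can be checked numerically. The following is a minimal sketch in plain Python, assuming the component ordering U = [Bx, By, Bz, Dx, Dy, Dz]; the function names are illustrative and do not come from the report’s code.

```python
def maxwell_jacobians(eps=1.0, mu=1.0):
    """Return the 6x6 flux Jacobians A, B, C for U = [Bx, By, Bz, Dx, Dy, Dz]."""
    ie, im = 1.0 / eps, 1.0 / mu
    A = [[0.0] * 6 for _ in range(6)]
    B = [[0.0] * 6 for _ in range(6)]
    C = [[0.0] * 6 for _ in range(6)]
    # (row, column, value) entries, 0-based, read off the flux vectors
    for r, c, v in [(1, 5, -ie), (2, 4, ie), (4, 2, im), (5, 1, -im)]:
        A[r][c] = v
    for r, c, v in [(0, 5, ie), (2, 3, -ie), (3, 2, -im), (5, 0, im)]:
        B[r][c] = v
    for r, c, v in [(0, 4, -ie), (1, 3, ie), (3, 1, im), (4, 0, -im)]:
        C[r][c] = v
    return A, B, C


def matvec(M, u):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, u)) for row in M]


def fluxes(U, eps=1.0, mu=1.0):
    """Flux vectors F_x, F_y, F_z written directly from Maxwell's curl equations."""
    Bx, By, Bz, Dx, Dy, Dz = U
    Fx = [0.0, -Dz / eps, Dy / eps, 0.0, Bz / mu, -By / mu]
    Fy = [Dz / eps, 0.0, -Dx / eps, -Bz / mu, 0.0, Bx / mu]
    Fz = [-Dy / eps, Dx / eps, 0.0, By / mu, -Bx / mu, 0.0]
    return Fx, Fy, Fz
```

For any state vector, `matvec(A, U)`, `matvec(B, U)` and `matvec(C, U)` should agree with the three vectors returned by `fluxes(U)`, which is the homogeneity property F = AU used in the text.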

Except for the inclusion of the source term (i.e., the normal derivative of tangential E), the same representations
given by Eqn. (13) or (14) may be used for Euler’s equations. Particularly, for the flux Jacobian
matrices,

[equations missing in the source]

for the dependent variables,

[equation missing in the source]

From these definitions it follows that

[equations missing in the source]

2.2  Flux  Splitting 


The flux Jacobian matrix A (as well as B and C) associated with both systems is diagonalizable since its
eigenvectors are linearly independent. That is, an alternative representation for A is

$$A = S_x \Lambda_x S_x^{-1}, \tag{29}$$

where $S_x$ is the modal matrix of A and $\Lambda_x$ is its spectral matrix, which is a diagonal matrix. A detailed
eigenvector analysis reveals for Maxwell’s equations that

$$\Lambda_x = \mathrm{Diag}\{-\lambda, -\lambda, \lambda, \lambda, 0, 0\}, \tag{30}$$

where $\lambda = 1/\sqrt{\mu\epsilon}$, and

$$S_x = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 1 \\
\eta & 0 & -\eta & 0 & 0 & 0 \\
0 & -\eta & 0 & \eta & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0
\end{bmatrix}, \tag{31}$$

with $\eta = \sqrt{\mu/\epsilon}$.

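The flux splitting procedure is based upon the decomposition of the spectral matrix in terms of the sign
of its members:

$$\Lambda_x = \Lambda_x^+ + \Lambda_x^-, \tag{32}$$

where

$$\Lambda_x^+ = \mathrm{Diag}\{0, 0, \lambda, \lambda, 0, 0\} \tag{33}$$

$$\Lambda_x^- = \mathrm{Diag}\{-\lambda, -\lambda, 0, 0, 0, 0\}. \tag{34}$$

It follows from Eqn. (29) that

$$A = S_x(\Lambda_x^+ + \Lambda_x^-)S_x^{-1} = A^+ + A^-. \tag{35}$$

Here $A^+ = S_x \Lambda_x^+ S_x^{-1}$ and $A^- = S_x \Lambda_x^- S_x^{-1}$. Since $F_x = AU$, then $F_x = (A^+ + A^-)U = F_x^+ + F_x^-$, where
$F_x^+ = A^+U$ and $F_x^- = A^-U$. Carrying out the required matrix operations, we discover that

$$F_x^+ = \frac{1}{2}\left[0,\; \left(B_y\lambda - \frac{D_z}{\epsilon}\right),\; \left(B_z\lambda + \frac{D_y}{\epsilon}\right),\; 0,\; \left(\frac{B_z}{\mu} + D_y\lambda\right),\; \left(-\frac{B_y}{\mu} + D_z\lambda\right)\right]^T \tag{36}$$

$$F_x^- = \frac{1}{2}\left[0,\; -\left(B_y\lambda + \frac{D_z}{\epsilon}\right),\; \left(-B_z\lambda + \frac{D_y}{\epsilon}\right),\; 0,\; \left(\frac{B_z}{\mu} - D_y\lambda\right),\; -\left(\frac{B_y}{\mu} + D_z\lambda\right)\right]^T. \tag{37}$$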

By  way  of  a  similar  analysis,  the  split  fluxes  for  Euler’s  equations  are  found  to  be 

Fr  =  I  0]‘ 


F.-  =  \  [-vp,  +  Jr,  0]‘ .  (39) 

Regardless  of  the  governing  equations  to  be  considered,  the  duplication  of  the  flux  splitting  procedure 
for  the  other  fluxes  yields  the  expanded  form  for  the  conservative  equation: 

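$$\frac{\partial U}{\partial t} + \frac{\partial F_x^+}{\partial x} + \frac{\partial F_x^-}{\partial x} + \frac{\partial F_y^+}{\partial y} + \frac{\partial F_y^-}{\partial y} + \frac{\partial F_z^+}{\partial z} + \frac{\partial F_z^-}{\partial z} = 0. \tag{40}$$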

The  previous  result  is  the  equation  on  which  characteristic  theory  is  based.  A  simple  discretization  scheme 
is  given  next  to  demonstrate  the  basic  concepts  of  the  numerical  theory. 
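
The x-direction splitting for Maxwell’s equations can be reproduced with a few lines of plain Python. This is a sketch under stated assumptions: the modal matrix and eigenvalue ordering are taken as Diag{−λ, −λ, λ, λ, 0, 0} with η = sqrt(μ/ε), the state ordering is U = [Bx, By, Bz, Dx, Dy, Dz], and the helper names (`matmul`, `inverse`, `split_jacobians_x`, `split_fluxes_x`) are hypothetical.

```python
import math


def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse(M):
    """Gauss-Jordan inverse with partial pivoting (adequate for a 6x6 matrix)."""
    n = len(M)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def split_jacobians_x(eps=1.0, mu=1.0):
    """Build A+ and A- as S * Lambda(+/-) * S^-1."""
    lam = 1.0 / math.sqrt(mu * eps)
    eta = math.sqrt(mu / eps)
    S = [[0, 0, 0, 0, 0, 1],
         [eta, 0, -eta, 0, 0, 0],
         [0, -eta, 0, eta, 0, 0],
         [0, 0, 0, 0, 1, 0],
         [0, 1, 0, 1, 0, 0],
         [1, 0, 1, 0, 0, 0]]
    eig = [-lam, -lam, lam, lam, 0.0, 0.0]
    S_inv = inverse(S)

    def rebuild(vals):
        D = [[vals[i] if i == j else 0.0 for j in range(6)] for i in range(6)]
        return matmul(matmul(S, D), S_inv)

    return rebuild([max(e, 0.0) for e in eig]), rebuild([min(e, 0.0) for e in eig])


def split_fluxes_x(U, eps=1.0, mu=1.0):
    """Closed-form F_x^+ and F_x^- for U = [Bx, By, Bz, Dx, Dy, Dz]."""
    lam = 1.0 / math.sqrt(mu * eps)
    Bx, By, Bz, Dx, Dy, Dz = U
    Fp = [0.0, 0.5 * (lam * By - Dz / eps), 0.5 * (lam * Bz + Dy / eps),
          0.0, 0.5 * (Bz / mu + lam * Dy), 0.5 * (-By / mu + lam * Dz)]
    Fm = [0.0, -0.5 * (lam * By + Dz / eps), 0.5 * (-lam * Bz + Dy / eps),
          0.0, 0.5 * (Bz / mu - lam * Dy), -0.5 * (By / mu + lam * Dz)]
    return Fp, Fm
```

Multiplying the matrices from `split_jacobians_x` by a sample state should reproduce the closed-form vectors from `split_fluxes_x`, and the two split fluxes should sum to the unsplit flux F_x.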
2.3 A Simple Difference Scheme

The advantage of writing Maxwell’s or Euler’s equations in characteristic form, as manifested in Eqn. (40),
is found in the differencing scheme of the spatial derivatives. Since $F_x^+$ corresponds to eigenvalues that are
positive, which leads to positive going waves in the x-direction in a one-dimensional analysis, it makes sense
to apply a windward differencing scheme to $F_x^+$. For second-order accuracy, we may use

$$\frac{\partial F_x^+}{\partial x} \approx \frac{3F_i^+ - 4F_{i-1}^+ + F_{i-2}^+}{2\,\delta x}. \tag{41}$$

Hence, it is readily seen that the wave is discretized solely in terms of its values within its domain of influence.
Similarly, $F_x^-$ is associated with waves that are traveling in the negative x-direction. Hence, we let

$$\frac{\partial F_x^-}{\partial x} \approx -\frac{3F_i^- - 4F_{i+1}^- + F_{i+2}^-}{2\,\delta x}. \tag{42}$$

Note: The use of windward differencing will introduce a certain amount of numerical dissipation and dispersion
into the solution.
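
The two windward stencils can be written as small helper functions. This is an illustrative sketch, assuming the split fluxes are stored in a Python list on a uniform grid of spacing `dx`; the function names are not from the report.

```python
def ddx_windward_plus(F, i, dx):
    """Second-order backward difference, used for F+ (right-running waves)."""
    return (3.0 * F[i] - 4.0 * F[i - 1] + F[i - 2]) / (2.0 * dx)


def ddx_windward_minus(F, i, dx):
    """Second-order forward difference, used for F- (left-running waves)."""
    return -(3.0 * F[i] - 4.0 * F[i + 1] + F[i + 2]) / (2.0 * dx)
```

Both stencils are exact for quadratic data, so sampling x² on a uniform grid returns the slope 2x at any interior node, while each stencil only reads values on the side the wave comes from.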

For certain classes of temporal integrators, such as the Runge-Kutta integrators, the previous scheme
is found to be conditionally stable. Formally, the RK methods may be expressed in terms of a truncated
Taylor series for the matrix $e^{A\delta t}$, where A is the spatial discretization matrix:

$$e^{A\delta t} \approx I + \sum_{m=1}^{M} \frac{(A\,\delta t)^m}{m!}. \tag{43}$$

Obviously, the implementation of this series is straightforward and any order of accuracy can be achieved,
if one is willing to expend the computational resources to achieve that accuracy. However, to minimize
the number of evaluations of the right-hand side of Eqn. (43) and to limit the storage required for each
evaluation, the second-order and fourth-order schemes are used. For second-order accuracy,

$$k_0 = \delta t\, R(f_0)$$

$$k_1 = \delta t\, R(f_1)$$

$$f^{n+1} = f^n + k_1. \tag{44}$$

Here R is the residual; $f_0 = f(x, t_0)$ and $f_1 = f_0 + k_0/2$. For fourth-order accuracy,

$$k_0 = \delta t\, R(f_0)$$

$$k_1 = \delta t\, R(f_1)$$

$$k_2 = \delta t\, R(f_2)$$

$$k_3 = \delta t\, R(f_3)$$

$$f^{n+1} = f^n + (k_0 + 2k_1 + 2k_2 + k_3)/6. \tag{45}$$

Here R is the residual; $f_0 = f(x, t_0)$, $f_1 = f_0 + k_0/2$, $f_2 = f_0 + k_1/2$ and $f_3 = f_0 + k_2$. A detailed Fourier
analysis shows that the second-order, fully upwind, two-stage Runge-Kutta scheme is conditionally stable
with a CFL = 0.5; for the four-stage scheme, CFL = 0.695.
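
The two Runge-Kutta updates can be sketched for a scalar unknown as follows. This assumes the classical midpoint (two-stage) and classical four-stage forms, with `R` a user-supplied residual function; the names `rk2_step` and `rk4_step` are illustrative only.

```python
def rk2_step(R, f, dt):
    """Two-stage (midpoint) Runge-Kutta update: f_new = f + k1."""
    k0 = dt * R(f)
    k1 = dt * R(f + 0.5 * k0)
    return f + k1


def rk4_step(R, f, dt):
    """Classical four-stage Runge-Kutta update."""
    k0 = dt * R(f)
    k1 = dt * R(f + 0.5 * k0)
    k2 = dt * R(f + 0.5 * k1)
    k3 = dt * R(f + k2)
    return f + (k0 + 2.0 * k1 + 2.0 * k2 + k3) / 6.0
```

For a linear residual R(f) = a f, one step of either routine reproduces the Taylor series of exp(a δt) truncated after the second-order or fourth-order term, which is the connection to the truncated-series view of the RK methods.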

To  apply  the  flux-splitting  procedure  to  a  grid  that  is  non-Cartesian,  additional  analytical  work  is 

required.  This  subject  is  studied  next. 

2.4  Strong  Conservative  Form 


To cast the conservative equation into strong conservative form, we map the Cartesian independent variables
x, y, z into a generalized coordinate system ξ, η, ζ; the dependent variables are still expressed in terms of
the Cartesian frame. Hence, consider the following coordinate transformation: ξ = ξ(x, y, z), η = η(x, y, z)
and ζ = ζ(x, y, z) or similarly, for one-to-one transformations, x = x(ξ, η, ζ), y = y(ξ, η, ζ) and z =
z(ξ, η, ζ). Associated with this transformation is a set of metrics (e.g., $x_\xi$) that convey the geometry of
the transformation.

After a sequence of mathematical operations that involve the metrics of the transformation, Eqn. (13)
becomes

$$\frac{\partial \tilde U}{\partial t} + \frac{\partial \tilde F}{\partial \xi} + \frac{\partial \tilde G}{\partial \eta} + \frac{\partial \tilde H}{\partial \zeta} = 0, \tag{46}$$

where $\tilde U$ is the unknown domain vector and $\tilde F$, $\tilde G$ and $\tilde H$ are the domain fluxes in the ξ, η and ζ directions,
respectively. With V denoting the Jacobian, $\tilde U$ and the fluxes take on the following meaning:

$$\tilde U = U V \tag{47}$$

$$\tilde F = (\xi_x F_x + \xi_y F_y + \xi_z F_z)V \tag{48}$$

$$\tilde G = (\eta_x F_x + \eta_y F_y + \eta_z F_z)V \tag{49}$$

$$\tilde H = (\zeta_x F_x + \zeta_y F_y + \zeta_z F_z)V. \tag{50}$$

(50) 

A  similar  representation  holds  for  Euler’s  equations,  except  that  one  term  in  the  strong  conservative  equation 
may  be  omitted,  since  current  flow  is  restricted  to  a  locally  two-dimensional  surface. 
3 Finite-Volume Procedure

The discretization of the conservative equation is accomplished via a cell-centered, finite-volume approach
in conjunction with the Runge-Kutta two-stage integrator. By following the lead of Steger and Warming
[12], we reconsider the split fluxes (e.g., $F^+$ and $F^-$). These fluxes are associated with the right and left
running waves at the cell interface; the states $U^L$ and $U^R$ are constructed from known values of U in adjacent
cells. For example, the κ-scheme [11] requires that at interface i + 1/2,

$$U^L_{i+1/2} = U_i + \frac{\phi}{4}\left[(1 - \kappa)\nabla + (1 + \kappa)\Delta\right]U_i,$$

and

$$U^R_{i+1/2} = U_{i+1} - \frac{\phi}{4}\left[(1 + \kappa)\nabla + (1 - \kappa)\Delta\right]U_{i+1},$$

where φ is a limiter (φ = 0, 1), κ is an accuracy parameter, $\nabla U_i = U_i - U_{i-1}$ and $\Delta U_i = U_{i+1} - U_i$. When φ
equals unity and κ takes on a value of −1, 1/3 or +1, the scheme is deemed second-order windward, third-order
windward-biased or second-order central differenced, respectively; the second-order Fromm scheme
is recovered when φ = 1 and κ = 0. Thus, one can see that the advantage of the κ-scheme is found in
its ability to capture the specific physics of the problem at hand. Although third-order accuracy is desired
in most situations, a second-order windward scheme has the advantage of predicting the slope of the field
at a dielectric interface from field values totally resident in the dielectric. In contrast, a central difference
approximation will lead to boundary errors since the prediction of the dependent variable requires knowledge
of the dependent variable on both sides of the dielectric interface; for large disparities between the interfacial
permittivities, the discontinuities in either normal E or tangential D will not be captured. To predict these
discontinuities correctly, two options exist: 1) use a fully windward scheme, which will require a smaller
time step, as noted from the stability analysis [13] or 2) set φ to a value of zero. For this latter case, a
certain amount of numerical dissipation will be introduced into the solution due to the resulting first-order
approximation.
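
The state reconstruction can be sketched as a single function. It assumes the standard MUSCL κ-scheme with a fixed limiter value φ and cell values stored in a Python list; the name `kappa_states` and the argument layout are hypothetical.

```python
def kappa_states(U, i, kappa=1.0 / 3.0, phi=1.0):
    """Left and right states at interface i+1/2 from cell values U.

    Requires 1 <= i <= len(U) - 3 so that U[i-1] and U[i+2] exist.
    """
    back = lambda j: U[j] - U[j - 1]   # backward difference, nabla U_j
    fwd = lambda j: U[j + 1] - U[j]    # forward difference, Delta U_j
    UL = U[i] + 0.25 * phi * ((1 - kappa) * back(i) + (1 + kappa) * fwd(i))
    UR = U[i + 1] - 0.25 * phi * ((1 + kappa) * back(i + 1)
                                  + (1 - kappa) * fwd(i + 1))
    return UL, UR
```

With φ = 1, linear data give identical left and right states equal to the midpoint value for any κ; with φ = 0 the function falls back to the first-order states U_i and U_{i+1}. For κ = 1/3 and cell-averaged quadratic data, both states reproduce the exact interface value, which is the sense in which that choice is third-order.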

Once the left and right states of U are estimated at the i + 1/2 interface, the flux crossing that surface
is simply,

$$F_{x,i+1/2} = F_x^+(U^L_{i+1/2}) + F_x^-(U^R_{i+1/2}).$$

Similarly, at interfaces j and k:

$$F_{y,j+1/2} = F_y^+(U^L_{j+1/2}) + F_y^-(U^R_{j+1/2})$$

and

$$F_{z,k+1/2} = F_z^+(U^L_{k+1/2}) + F_z^-(U^R_{k+1/2}).$$

For generalized coordinates, the flux, say $\tilde F$, is split in the direction of ξ. This is accomplished by a local
transformation matrix T and by defining a new flux $\bar F$: $\bar F = T\tilde F$. Since T can be chosen such that $\bar F$
and the Cartesian flux have the same functional form, the eigenvalue/eigenvector information is directly obtained from the
Cartesian formulation. Thus, if $\bar F = \bar F^+ + \bar F^-$ then $\tilde F = \tilde F^+ + \tilde F^-$, where $\tilde F^+ = T^{-1}\bar F^+$ and $\tilde F^- = T^{-1}\bar F^-$.
This same procedure is repeated in the directions η and ζ, for $\tilde G$ and $\tilde H$, respectively.

To  maintain  stability,  the  two-stage  Runge-Kutta  integrator  is  invoked,  as  defined  earlier.  Although  other 
temporal  schemes  may  be  used,  the  RK  integrator  is  simple  to  code  and  to  vectorize  for  supercomputing 

platforms. 
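
To show how the pieces of this section fit together, the following sketch applies them to the simplest possible case: the scalar advection equation u_t + c u_x = 0 with c > 0 on a periodic one-dimensional grid. This is a toy problem chosen for illustration, not the report’s three-dimensional solver. With c > 0 the split fluxes reduce to F+ = c u and F− = 0, so only the left state enters the interface flux; κ = 1/3, φ = 1 and a CFL number of 0.4 are assumed values.

```python
import math


def kappa_left_state(u, i, kappa, phi):
    """Left state at interface i+1/2 on a periodic grid."""
    n = len(u)
    back = u[i] - u[(i - 1) % n]
    fwd = u[(i + 1) % n] - u[i]
    return u[i] + 0.25 * phi * ((1 - kappa) * back + (1 + kappa) * fwd)


def residual(u, c, dx, kappa=1.0 / 3.0, phi=1.0):
    """R(u) = -(F_{i+1/2} - F_{i-1/2}) / dx with F+ = c u and F- = 0."""
    n = len(u)
    flux = [c * kappa_left_state(u, i, kappa, phi) for i in range(n)]
    # flux[i] lives at i+1/2; flux[i - 1] wraps around for i = 0 (periodic)
    return [-(flux[i] - flux[i - 1]) / dx for i in range(n)]


def rk2_step(u, dt, c, dx):
    """Two-stage Runge-Kutta update of the whole cell array."""
    k0 = residual(u, c, dx)
    u_mid = [a + 0.5 * dt * k for a, k in zip(u, k0)]
    k1 = residual(u_mid, c, dx)
    return [a + dt * k for a, k in zip(u, k1)]


def advect(n=100, cfl=0.4, c=1.0, periods=1.0):
    """Advect a Gaussian pulse around the unit periodic domain."""
    dx = 1.0 / n
    dt = cfl * dx / c
    steps = round(periods / (c * dt))
    x = [(i + 0.5) * dx for i in range(n)]
    u0 = [math.exp(-((xi - 0.5) / 0.1) ** 2) for xi in x]
    u = list(u0)
    for _ in range(steps):
        u = rk2_step(u, dt, c, dx)
    return u0, u
```

Calling `advect()` returns the initial and final cell values after one trip around the domain. Because the update is written in flux-difference form, the sum of the cell values is preserved to rounding error, and the pulse should return close to its starting profile.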

4  Numerical  Results 

To demonstrate some of the previous concepts, the problem of a plane wave scattering from a perfectly
conducting sphere is considered. The computational grid is constructed by employing the Cartesian to
Spherical coordinate mapping functions. This mapping results in a structured grid whose indices I, J, K
denote the number of grid lines in the respective coordinate directions. The sphere is illuminated from above
by a TEM Gaussian plane wave [expression illegible in the source], where it has been assumed that the surrounding
medium has unity permittivity and permeability, thus making the phase velocity unity. To achieve a
usable spectral content of the incident wave up to 12 r/s, w is set to 3.96. Finally, a CFL number of unity
has experimentally been proven to yield stable results.¹

Figures 1, 2 and 3 show plots of the charge density on the sphere as a function of elevation angle after
150, 300 and 450 time steps, respectively, for the present situation ΔR = 0.05 m, (I, J, K) = (30, 30, 54)
and φ = 0. The data associated with one curve are obtained via a first-order boundary condition (by first-order,
we mean that the unknowns that exist on the spherical surface are extrapolated from the interior via
a first-order shift from the nearest cell); the other curve is associated with the exact boundary condition
described in this report. These plots serve to validate the theory of the exact boundary condition. That is,
with these plots confidence is gained from numerical experimentation that the charge density indeed satisfies
the linearized inviscid Euler equations.

¹ Technically, the CFL number should be set to a value no greater than 0.87 [13]. However, due to the way the time step is
calculated from the grid geometry, a CFL of 1.00 still results in stable data.

Figure 1: Charge density as a function of elevation angle after 150 time steps; φ = 0.
Figure 2: Charge density as a function of elevation angle after 300 time steps; φ = 0.
Figure 3: Charge density as a function of elevation angle after 450 time steps; φ = 0.

Using this same coarse grid and 1000 time steps, the RCS of the sphere is computed. Consider Figures
4 and 5, which show the RCS when φ = 0 and φ = 90°, respectively; also shown on these plots are the
theoretical data obtained from the Mie series. All data correspond to a sphere whose electrical radius,
ka,  is  6.0.  To  obtain  this  data,  Maxwell’s  equations  are  cast  in  terms  of  scattered  field  quantities.  To 
generate frequency-domain information from the time-domain data, a processing algorithm is required that
computes  the  running  Fourier  transform.  The  RCS  data  is  obtained  by  invoking  Schelkunoff’s  principle  of 
equivalent  sources.  Since  the  finite-volume  formulation  requires  the  knowledge  of  surface  cell  areas  in  the 
flux  evaluations,  the  computation  of  the  Schelkunoff  surface  integrals  is  reduced  to  a  sum  of  these  cell  areas 
weighted  by  the  integrand  at  the  cell  center. 
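
A running Fourier transform of the kind mentioned above can be accumulated one time step at a time, so that the full time history never has to be stored. The sketch below is one plausible form, assuming a simple rectangle-rule sum at a fixed set of angular frequencies; the class name `RunningFourier` and its interface are hypothetical and are not taken from the report’s code.

```python
import cmath
import math


class RunningFourier:
    """Accumulates X(w) = sum_n x(t_n) exp(-i w t_n) dt, one sample per step."""

    def __init__(self, omegas, dt):
        self.omegas = list(omegas)
        self.dt = dt
        self.step = 0
        self.acc = [0j for _ in self.omegas]

    def update(self, sample):
        """Add the contribution of the sample taken at t = step * dt."""
        t = self.step * self.dt
        for m, w in enumerate(self.omegas):
            self.acc[m] += sample * cmath.exp(-1j * w * t) * self.dt
        self.step += 1


def transform_of_cosine(w0=2.0 * math.pi, dt=0.01, steps=200):
    """Feed cos(w0 t) through the accumulator and return X(w0)."""
    rft = RunningFourier([w0], dt)
    for n in range(steps):
        rft.update(math.cos(w0 * n * dt))
    return rft.acc[0]
```

In a time-domain solver, `update` would be called once per time step for each field sample on the integration surface; here `transform_of_cosine` simply feeds in a cosine over two full periods, for which the accumulated value at the driving frequency equals half the record length.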

Obviously,  all  RCS  solutions  tend  to  follow  the  general  undulations  of  the  theoretical  data  and  capture 
the  correct  value  for  the  forward-scattered  RCS.  Unfortunately,  the  scheme  that  uses  the  exact  boundary 
condition  implementation  appears  to  deviate  more  from  the  theoretical  solution  in  the  backscattered  region. 

Numerical  simulation  has  proven  that  the  derived  Euler  equations  indeed  predict  the  surface  charge  and 
current  on  a  perfectly  conducting  surface.  However,  after  numerous  simulations,  we  have  found  that  the 
algorithm is subject to late-time instabilities. For example, when the same simulation that was used to
create the previous RCS data is extended to 2000 time steps, the data no longer show any trend towards
convergence. The reasons for this remain undiscovered. Two possibilities have been identified: 1) The
source term in Euler’s equations is not truly a source term but a field term that is computed from Maxwell’s
equations. Hence, the way this term is computed can affect stability. For the present implementation, one-sided
differencing that is second-order accurate is used. 2) Currently, the boundary conditions are applied
at the cell centers rather than at the cell walls. It has been suggested that such an approach reduces the
accuracy of the scheme, thereby inducing possible instabilities.

Figure 4: RCS of a sphere when ka = 6 and φ = 0.
Figure 5: RCS of a sphere when ka = 6 and φ = 90°.

5  Conclusion 

A formal treatment for extending Maxwell’s equations to the boundary has been given. Based upon preliminary
numerical experiments, these new equations, identified as Euler’s equations, indeed predict the
space-time profile of the surface charge and surface current. Unfortunately, when cast in terms of a finite-volume
procedure, various simulations have shown that the data may become unstable for a highly resolved
grid or for late-time simulations. The reasons for this instability are the subject of future work.

Although future work should include an analytical treatment of the von Neumann stability properties, the
treatment is not trivial. First, most stability analyses start by considering a one-dimensional domain with
periodic boundary conditions. For our situation, the domain is not periodic but is terminated by a perfect
conductor. Such a termination greatly complicates the analysis. Moreover, a one-dimensional plane wave is
by definition transverse to the perfect conductor; hence, it cannot induce any surface charge. Thus, Euler’s
equations are reduced to a single equation for which the time-rate-of-change of the current density equals
the normal derivative of the tangential electric field.

Other  future  work  includes  1)  the  incorporation  of  the  boundary  conditions  at  flux  walls  rather  than  at 
cell  centers  and  2)  simplifying  the  problem  such  that  the  scatterer’s  geometry  coincides  with  a  Cartesian 
grid.  This  latter  study  would  identify  whether  or  not  the  curvilinear  coordinate  transformation  has  any 
impact  on  numerical  instability. 

Finally,  in  this  project  the  finite-volume  code  that  incorporates  either  the  first-order  boundary  condition 
or  the  Euler-type  boundary  condition  is  streamlined  for  robust  performance.  The  code  is  vectorized  to 
accommodate  the  vector  nature  of  the  Cray  machines.  In  addition,  the  code  is  customized  for  low  memory 
usage.  For  example,  for  a  grid  size  of  (50,55,104)  and  for  a  simulation  time  of  4000  time  steps,  the  code 
requires  only  15  Megawords  of  memory  and  2.7  cpu  hours  on  a  Cray  YMP;  on  Cray  C90,  1.3  cpu  hours 
are  needed.  This  effort  in  improving  the  code  performance  will  benefit  other  DoD  projects  currently  being 
managed  by  the  PI. 
Abstract

In this research a systematic design procedure for multivariable nonlinear tracking and
output/state decoupling control is developed by way of linearization along a nominal trajectory.
The resulting linear time-varying (LTV) tracking error dynamics are then stabilized and decoupled
using PD-eigenstructure assignment in a way similar to the eigenstructure assignment design for
LTI systems. The main accomplishments of this research include: (i) extension of the PD-spectrum
and PD-eigenvector concepts from scalar polynomial differential operators to vector polynomial
differential operators, (ii) extension of the Silverman-Wolovich (S-W) transformation to the entire
class of uniformly completely controllable (u.c.c.) multivariable (MV) LTV systems, (iii) PD-eigenvector
based criteria for uniform controllability and observability of MV LTV systems, (iv)
stabilization of u.c.c. MV LTV systems by PD-spectrum assignment, (v) output/state decoupling
of u.c.c. MV LTV systems by PD-eigenstructure assignment, and (vi) a BTT autopilot with
decoupled roll-yaw dynamics using the PD-eigenstructure assignment control. Due to time
constraints, implementation and simulation results are not available at present. Further
research is needed to address the complexity of the implementation, and to validate the theory
and design procedure by simulations. A multiobjective PD-eigenstructure assignment concept is
also proposed to address an array of challenges posed by modern missile technology. Additional
applications of the results of this research can be found in aircraft flight control, spacecraft attitude
control, vibration control, robotics and the like.
1. Introduction

Due to ever more stringent performance requirements and inherently nonlinear, time-varying
and highly coupled aerodynamics [1], modern missile control, e.g. the bank-to-turn (BTT) missile
autopilot, poses a challenge to control design that has not been successfully met [2]. Despite its
well-known limitations, gain scheduling (GS) appears to be the focus of the current prominent
research efforts [3], [4], [5]. Because of the lack of a mathematical theory for time-varying
dynamics, the GS approach treats the nonlinear, time-varying missile dynamics as time-invariant
dynamics linearized at discrete operating states. An autopilot then comprises a series of
linear time-invariant (LTI) controllers scheduled for these frozen operating states and frozen-time
dynamics using various LTI control design techniques, such as eigenstructure assignment for
guidance command tracking and roll-yaw decoupling [3].

Scheduling of frozen-time, frozen-state controllers for fast time-varying dynamics is known
to be mathematically fallacious and practically hazardous [4]. Recent research efforts have been
directed towards applying robust control techniques to extend the stability margin of GS
controllers [3], [5]. While the stability margin at each frozen state is greatly improved with
modern robust controllers, this does not seem to benefit the stability margin of the overall system
proportionally. Indeed, failures have been reported in these recent attempts [5; pp. 14-15]. It
appears that the GS control technique has been stretched to its limit in coping with fast
time-varying dynamics.

In addition to the time-varying stabilization problem, guidance command tracking with a
nonlinear, time-varying airframe poses other challenges in missile autopilot design. In particular,
normal acceleration tracking with a tail-fin-controlled missile is known to be nonminimum phase,
and the roll-yaw aerodynamics are highly coupled for a BTT missile. Actuator saturation and
suppression of unmodeled structural modes are other practical issues that have to be effectively
dealt with by an autopilot. These problems become even more difficult to tackle in the presence
of fast time-varying system parameters.

In this research, a recently developed differential-algebraic spectral theory for linear
time-varying (LTV) systems is applied to nonlinear tracking control, such as the BTT missile autopilot
design problem. Specifically, the notions of Parallel D-eigenvalues (PD-eigenvalues) and Parallel
D-eigenvectors (PD-eigenvectors) [6] are used to extend existing results [3] on robust GS
eigenstructure assignment command tracking and roll-yaw decoupling controllers. The
differential-algebraic spectral theory treats time-varying dynamics as such; it is therefore expected to
succeed where the conventional frozen-time, frozen-state GS techniques fail.

Recently,  some  encouraging  preliminary  results  on  a  pitch  autopilot  design  based  on  this 
differential-algebraic  spectral  theory  have  been  obtained  [7],  [8].  The  design  approach  is  by 
linearization  of  the  nonlinear  airframe  along  a  nominal  (command)  trajectory  to  obtain  LTV 
tracking  error  dynamics.  This  design  approach  poses  two  technical  challenges:  (i)  implementation 
of  the  inverse  of  the  nonlinear  airframe  dynamics  to  generate  the  nominal  control,  and  (ii) 
exponential  stabilization  of  the  LTV  tracking  error  dynamics  to  achieve  guidance  command 
tracking. The first problem can be solved by employing a (dynamic) neural network, which is not
discussed here (cf. [8] for more information). The tracking error stabilization is achieved by
assigning  the  extended-mean  (EM)  of  the  closed-loop  PD-eigenvalues  of  the  LTV  error  dynamics 
to  the  left-half-plane  (LHP)  of  the  complex  numbers  C,  in  a  way  very  similar  to  the  LTI 
eigenvalue  assignment. 

Simulation studies reported in [7], [8] showed that the autopilot was capable of angle-of-attack
(AOA) and normal acceleration (NA) tracking of various command trajectories throughout
the entire flight envelope without explicit scheduling of any controller parameters. Figure 1.1
below shows AOA tracking performance of the autopilot for an unrealistically demanding
sinusoidal trajectory in the presence of all combinations of ±50% variation of the aerodynamic
coefficients, demonstrating excellent robustness. It is noted that this surprisingly large parametric
stability margin was not specifically designed for, but is a consequence of the trajectory linearization
and the correct stability criterion based on PD-eigenvalues for LTV systems, as opposed to the
pointwise linearization and frozen-time stability criterion typical of the GS controllers.


Figure 1.1 Tracking Performance and Robustness
with ±50% Variation on Aerodynamic Coefficients $C_m$, $C_n$


The present research is a continuation of [7] and an LTV extension of [3]. It is believed to be
the first attempt at decoupling of time-varying dynamics using a differential-algebraic approach.
The design method of [7] is based on the differential-algebraic spectral theory for the class of
nth-order scalar LTV dynamical systems of the form

$$y^{(n)} + \alpha_n(t)\,y^{(n-1)} + \cdots + \alpha_2(t)\,\dot{y} + \alpha_1(t)\,y = 0 \tag{1.1}$$

$$y^{(k)}(t_0) = y_{k0}, \quad k = 0, 1, \ldots, n-1$$

which can be conveniently represented as $\mathcal{P}_\alpha\{y\} = 0$ using the scalar polynomial differential
operator (SPDO)

$$\mathcal{P}_\alpha = \delta^n + \alpha_n(t)\,\delta^{n-1} + \cdots + \alpha_2(t)\,\delta + \alpha_1(t) \tag{1.2}$$


where $\delta = d/dt$ is the derivative operator. It is well-known that the subclass of LTI systems (1.1),
where $\alpha_k(t) = \alpha_k$, enjoys an algebraic spectral theory that facilitates analytical solutions, precise
stability criteria, frequency domain analysis and synthesis, and (robust) stabilization control design
techniques. However, as is also well-known, this (time-invariant) algebraic spectral theory does
not carry over, in general, to the time-varying case.
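The failure of frozen-time eigenvalues as a stability test can be seen in a standard textbook counterexample, which is not taken from this report. The sketch below assumes the particular matrix `A(t)` shown in the code; its frozen-time eigenvalues have real part -1/4 for every t, yet the trajectory `x_exact(t)` (a closed-form solution, checked here by a finite-difference residual) grows like $e^{t/2}$. All function names are illustrative.

```python
import cmath
import math


def A(t):
    """System matrix of a classical 2x2 LTV counterexample (assumed, not from the report)."""
    c, s = math.cos(t), math.sin(t)
    return [[-1.0 + 1.5 * c * c, 1.0 - 1.5 * s * c],
            [-1.0 - 1.5 * s * c, -1.0 + 1.5 * s * s]]


def frozen_eigenvalues(t):
    """Eigenvalues of A(t) with t held fixed (the frozen-time spectrum)."""
    a = A(t)
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return ((tr + root) / 2.0, (tr - root) / 2.0)


def x_exact(t):
    """Closed-form solution x(t) = e^{t/2} [cos t, -sin t]^T of x' = A(t) x."""
    g = math.exp(0.5 * t)
    return [g * math.cos(t), -g * math.sin(t)]


def residual(t, h=1e-6):
    """Largest entry of |x'(t) - A(t) x(t)|, with x'(t) from a central difference."""
    xp, xm, x, a = x_exact(t + h), x_exact(t - h), x_exact(t), A(t)
    worst = 0.0
    for i in range(2):
        dx = (xp[i] - xm[i]) / (2.0 * h)
        ax = a[i][0] * x[0] + a[i][1] * x[1]
        worst = max(worst, abs(dx - ax))
    return worst


def norm(v):
    """Euclidean norm of a 2-vector."""
    return math.hypot(v[0], v[1])


if __name__ == "__main__":
    for t in (0.0, 1.0, 2.5):
        print(t, frozen_eigenvalues(t), residual(t))
    print("growth of |x| from t=0 to t=10:", norm(x_exact(10.0)) / norm(x_exact(0.0)))
```

The frozen-time test would declare this system stable at every instant, which is the kind of conclusion the differential-algebraic theory is meant to avoid.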

The differential-algebraic spectral theory for LTV dynamic systems (1.1) is based on a
classical result of Floquet (1879) on the factorization of SPDOs [9], [10]

$$\mathcal{P}_\alpha = (\delta - \lambda_n(t)) \cdots (\delta - \lambda_2(t))(\delta - \lambda_1(t)) \tag{1.3}$$

In the differential-algebraic spectral theory, a collection $\{\lambda_k(t)\}_{k=1}^{n}$ satisfying (1.3) is called a series
D-spectrum (SD-spectrum) for $\mathcal{P}_\alpha$, and an n-parameter family $\{\rho_k(t) = \lambda_{1,k}(t)\}_{k=1}^{n}$ is called a
parallel D-spectrum (PD-spectrum) for $\mathcal{P}_\alpha$, where $\lambda_{1,k}(t)$ are n particular solutions for $\lambda_1(t)$
satisfying some nonlinear independence constraints (cf. Definition 2.3 below). The scalar
functions $\lambda_k(t)$ and $\rho_k(t)$ are called SD- and PD-eigenvalues, respectively, for (1.1) and (1.2) [6].
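As a small worked illustration (added here, not part of the original report), the factorization (1.3) can be expanded by hand for the second-order case $n = 2$, using only the symbols of (1.1)-(1.3):

```latex
\begin{align*}
(\delta - \lambda_2)(\delta - \lambda_1)\,y
  &= (\delta - \lambda_2)\,(\dot{y} - \lambda_1 y) \\
  &= \ddot{y} - (\lambda_1 + \lambda_2)\,\dot{y}
     + (\lambda_1 \lambda_2 - \dot{\lambda}_1)\,y .
\end{align*}
% Matching coefficients with  \ddot{y} + \alpha_2(t)\dot{y} + \alpha_1(t) y :
\begin{align*}
\alpha_2 &= -(\lambda_1 + \lambda_2), &
\alpha_1 &= \lambda_1 \lambda_2 - \dot{\lambda}_1 .
\end{align*}
% Eliminating \lambda_2 = -\alpha_2 - \lambda_1 gives a Riccati equation for \lambda_1:
\begin{equation*}
\dot{\lambda}_1 + \lambda_1^2 + \alpha_2(t)\,\lambda_1 + \alpha_1(t) = 0 .
\end{equation*}
```

For constant coefficients and constant $\lambda_1$ the last line reduces to the usual characteristic equation $\lambda_1^2 + \alpha_2 \lambda_1 + \alpha_1 = 0$; in the time-varying case its particular solutions play the role of the PD-eigenvalues $\rho_k(t) = \lambda_{1,k}(t)$.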
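Let $A_c(t)$ be the companion matrix associated with $\mathcal{P}_\alpha$:

$$A_c(t) = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -\alpha_1(t) & -\alpha_2(t) & -\alpha_3(t) & \cdots & -\alpha_n(t) \end{bmatrix} \tag{1.4}$$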


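The matrix

$$\Gamma(t) = \begin{bmatrix} \lambda_1(t) & 1 & 0 & \cdots & 0 \\ 0 & \lambda_2(t) & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & \lambda_{n-1}(t) & 1 \\ 0 & \cdots & 0 & 0 & \lambda_n(t) \end{bmatrix} \tag{1.5}$$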
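is called a Series Spectral canonical form (SS canonical form) for $\mathcal{P}_\alpha$ and $A_c(t)$. The diagonal
matrix

$$\Upsilon(t) = \mathrm{diag}[\rho_1(t), \rho_2(t), \ldots, \rho_n(t)] \tag{1.6}$$

is called a Parallel Spectral canonical form (PS canonical form) for $\mathcal{P}_\alpha$ and $A_c(t)$. Associated
with every PS canonical matrix, there is a canonical modal matrix given by

$$V(\rho_1, \rho_2, \ldots, \rho_n) = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ \mathcal{Q}_{\rho_1}(1) & \mathcal{Q}_{\rho_2}(1) & \cdots & \mathcal{Q}_{\rho_n}(1) \\ \vdots & \vdots & & \vdots \\ \mathcal{Q}_{\rho_1}^{n-1}(1) & \mathcal{Q}_{\rho_2}^{n-1}(1) & \cdots & \mathcal{Q}_{\rho_n}^{n-1}(1) \end{bmatrix} \tag{1.7}$$

where $\mathcal{Q}_{\rho_i} = (\delta + \rho_i)$ and $\mathcal{Q}_{\rho_i}^{k} = \mathcal{Q}_{\rho_i}\mathcal{Q}_{\rho_i}^{k-1}$. It is noted that the column vectors $v_i(t)$ of $V(t)$ satisfy

$$A_c(t)\,v_i(t) - \rho_i(t)\,v_i(t) = \dot{v}_i(t) \tag{1.8}$$

and the row vectors $u_i^T(t)$ of $U(t) = V^{-1}(t)$ satisfy

$$u_i^T(t)\,A_c(t) - \rho_i(t)\,u_i^T(t) = -\dot{u}_i^T(t) \tag{1.9}$$

Thus, $v_i(t)$ and $u_i^T(t)$ have been called column PD-eigenvectors and row PD-eigenvectors,
respectively, of $\mathcal{P}_\alpha$ and $A_c$ associated with $\rho_i(t)$. SD-eigenvectors can be defined similarly [6].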
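Relation (1.8) can be checked numerically on a small example. The sketch below is an illustration only: it assumes a second-order operator with coefficients $\alpha_1(t) = 2t + 1$ and $\alpha_2(t) = t + 2$ (chosen here, not taken from the report), for which $\rho(t) = -t$ is one PD-eigenvalue, and it uses the $n = 2$ column $[1, \mathcal{Q}_\rho(1)]^T = [1, \rho]^T$ of the canonical modal matrix.

```python
def alpha1(t):
    """Assumed coefficient alpha_1(t) of the illustrative second-order SPDO."""
    return 2.0 * t + 1.0


def alpha2(t):
    """Assumed coefficient alpha_2(t) of the illustrative second-order SPDO."""
    return t + 2.0


def rho(t):
    """A PD-eigenvalue of the assumed operator: rho(t) = -t."""
    return -t


def rho_dot(t):
    """Time derivative of rho(t)."""
    return -1.0


def companion(t):
    """Companion matrix A_c(t) of the form (1.4) for n = 2."""
    return [[0.0, 1.0], [-alpha1(t), -alpha2(t)]]


def v(t):
    """Column PD-eigenvector [1, Q_rho(1)]^T = [1, rho]^T for n = 2."""
    return [1.0, rho(t)]


def v_dot(t):
    """Time derivative of v(t)."""
    return [0.0, rho_dot(t)]


def eigen_residual(t):
    """Largest entry of |A_c v - rho v - v_dot|; zero when (1.8) holds."""
    a, vec, vd, r = companion(t), v(t), v_dot(t), rho(t)
    worst = 0.0
    for i in range(2):
        lhs = a[i][0] * vec[0] + a[i][1] * vec[1] - r * vec[i]
        worst = max(worst, abs(lhs - vd[i]))
    return worst


if __name__ == "__main__":
    for t in (0.0, 0.5, 3.0, 10.0):
        print(t, eigen_residual(t))
```

The second row of the residual is exactly the Riccati condition $\dot{\rho} + \rho^2 + \alpha_2 \rho + \alpha_1 = 0$, so the check also confirms that $\rho(t) = -t$ is a valid choice for these coefficients.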

In this research, the concepts of SD- and PD-eigenvalues, and SD- and PD-eigenvectors for scalar
LTV systems will first be extended to the class of n-dimensional, $l$-input, $m$-output multivariable
(MV) LTV systems of the form

$$\dot{x} = A(t)x + B(t)u \tag{1.10}$$

$$y = C(t)x + D(t)u$$

The following basic assumptions are made throughout this report:

(a) The parameter matrices $A(t)$, $B(t)$, $C(t)$ are sufficiently smooth functions of time which are
bounded, and have bounded, continuous derivatives up to order $(n - 1)$.

(b) The matrix $D(t)$ is a bounded continuous function of $t$.

(c) For all allowable parameter values, rank $B(t) = l$ and rank $C(t) = m$.

(d) The pair $\{A(t), B(t)\}$ is uniformly completely controllable (u.c.c.) for all allowable
parameter values.

(e) The pair $\{A(t), C(t)\}$ is uniformly completely observable (u.c.o.) for all allowable
parameter values, if a state observer is needed.

The  extension  is  by  way  of  Lyapunov  (coordinate)  transformations  as  outlined  in  [11;  Chapter  5], 
which  allows  the  Silverman-Wolovich  (S-W)  transformation  [12],  [13]  to  be  employed  for  the 
assignment  of  PD-eigenstructures.  As  an  incidental  result,  a  limitation  of  the  S-W  transformation 
is  removed  so  that  it  is  now  applicable  to  the  entire  class  of  u.c.c.  MV  LTV  systems.  PD- 
eigenvector  based  criteria  are  then  obtained  for  uniform  controllability  and  uniform  observability 
of  MV  LTV  systems,  which  are  natural  LTV  counterparts  of  the  well  known  Popov-Belevitch- 
Hautus  eigenvector  tests  for  controllability  and  observability  of  LTI  systems  [14].  Together  these 
results  allow  stabilization  and  output/state  decoupling  of  u.c.c.  MV  LTV  systems  by  PD- 
eigenstructure  assignment,  which  will  be  applied  to  a  BTT  missile  autopilot  design  using  an 
approach  parallel  to  that  of  [3]. 

The  extension  of  the  differential-algebraic  spectral  theory  is  presented  in  Section  2.  Section 
3  presents  the  extended  S-W  transformation.  The  PD-eigenvector  criteria  for  uniform 
controllability and uniform observability are given in Section 4. In Section 5, the extended mean
stability criterion for scalar LTV systems is generalized to MV LTV systems, and stabilization of
u.c.c. MV LTV systems by PD-eigenvalue assignment is discussed. Section 6 is devoted to
output/state decoupling by PD-eigenstructure assignment. The main results of Sections 2-6 are
then applied in Section 7 to a BTT missile autopilot design. The report is concluded with Section 8,
containing a summary of the main results and suggestions for further studies along this
direction.

2.  PD-eigenstructure  for  VPDO 

Let $K$ be the differential ring of $C^\infty$ functions on $[0, \infty)$. Let $K^n$ be the n-dimensional
differential module of n-vectors $v(t) = \mathrm{col}[v_i(t)]$, and $K^{n \times n}$ be the differential module of $n \times n$
matrices $A(t) = [a_{ij}(t)]$, with entries $v_i$ and $a_{ij}$ from $K$. The following two n-dimensional,
first-order, mutually adjoint vector polynomial differential operators (VPDO)

$$\mathcal{P}_A = \delta - A(t) = \mathcal{Q}_{(-A^T)} \tag{2.1}$$

and

$$\mathcal{Q}_A = \delta + A^T(t) = \mathcal{P}_{(-A^T)} \tag{2.2}$$

play an instrumental role in the development of a differential-algebraic spectral theory for both
LTI and LTV systems. For instance, a MV LTV system (1.10) can be represented by

$$\mathcal{P}_A x = B(t)u \tag{2.3}$$

$$y = C(t)x + D(t)u$$

Moreover, if we define the inverse VPDO $\mathcal{P}_A^{-1} = [\delta I - A(t)]^{-1}$ as the integral operator such
that $\mathcal{P}_A^{-1}\mathcal{P}_A = I$, where $I$ is the identity operator, then the output $y(t)$ with zero initial
conditions can be conveniently represented by

$$y(t) = \left[ C(t)\,[\delta I - A(t)]^{-1} B(t) + D(t) \right] u(t) \tag{2.4}$$

In the sequel, we shall adopt the convention that $\mathcal{P}_A^0 = I$, the identity operator, and
$\mathcal{P}_A^k = \mathcal{P}_A \mathcal{P}_A^{k-1}$. The same applies to $\mathcal{Q}_A$. Although the VPDOs $\mathcal{P}_A$ and $\mathcal{Q}_A$ are defined for
n-vectors $v \in K^n$, we will also use them on matrices $M \in K^{n \times m}$ in a columnwise fashion. For
$n = 1$, $A(t)$ becomes a scalar function, say $a(t)$, and the VPDOs $\mathcal{P}_A$ and $\mathcal{Q}_A$ become SPDOs
denoted by $\mathcal{P}_a$ and $\mathcal{Q}_a$, respectively.

A set of n vectors $\{l_k(t)\}_{k=1}^{n}$, $l_k \in K^n$, is called a uniform basis for $\mathbb{R}^n$ if $|\det L(t)| > \delta$ for
some $\delta > 0$, for all $t \ge 0$, where $L(t) = [\,l_1(t) \mid l_2(t) \mid \cdots \mid l_n(t)\,]$. A uniform basis $\{l_k\}_{k=1}^{n}$
satisfying $\|L(t)\| < M$ and $\|\dot{L}(t)\| < M$ for some $M < \infty$, for all $t \ge 0$, is called a Lyapunov
basis for $\mathbb{R}^n$. The matrix $L(t)$ is called the coordinate transformation matrix from the standard
basis $\{e_k\}_{k=1}^{n}$ for $\mathbb{R}^n$ to the basis $\{l_k\}_{k=1}^{n}$, where $e_k$ denotes the kth column vector of the identity
matrix $I$. The matrix $L(t)$ is called a Lyapunov transformation matrix if $\{l_k\}_{k=1}^{n}$ is a Lyapunov
basis. With this terminology, the PD-eigenvalue and PD-eigenvector concepts for SPDO are
extended to VPDO in the following definition.


Definition 2.1.

(a) A continuously differentiable scalar function $\rho(t)$ is called a PD-eigenvalue of an
n-dimensional VPDO $\mathcal{P}_A$ if there exists a Lyapunov transformation matrix $L(t)$ such that the vector

$$p(t) = L(t)\,p_0(\rho(t)) \tag{2.5}$$

where

$$p_0(\rho(t)) = \begin{bmatrix} 1 \\ \mathcal{Q}_\rho(1) \\ \vdots \\ \mathcal{Q}_\rho^{n-1}(1) \end{bmatrix} \tag{2.6}$$

satisfies $\mathcal{P}_{[A - \rho I]}\,p = 0$, or what is the same

$$\dot{p}(t) = [A(t) - \rho(t)I]\,p(t) \tag{2.7}$$

The vector $p(t)$ is then called a PD-eigenvector of $\mathcal{P}_A$ associated with $\rho(t)$.

(b) Let $\rho(t)$ be a PD-eigenvalue of $\mathcal{P}_A$. A vector $q(t)$ satisfying $\mathcal{Q}_{[A - \rho I]}\,q = 0$, or what is the
same

$$\dot{q}(t) = -[A(t) - \rho(t)I]^T q(t) \tag{2.8}$$

is called an adjoint PD-eigenvector of $\mathcal{P}_A$ associated with $\rho(t)$.

(c) Let $p(t)$ and $q(t)$ be a PD-eigenvector and an adjoint PD-eigenvector for $\mathcal{P}_A$ associated
with a PD-eigenvalue $\rho(t)$. Then $\rho(t)$ is called a PD-eigenvalue of $A(t)$. The vectors $p(t)$ and
$q^T(t)$ are called a column PD-eigenvector and a row PD-eigenvector, respectively, of $A(t)$
associated with $\rho(t)$.




Remarks.

1. Let $\beta(t) = \{l_1(t), l_2(t), \ldots, l_n(t)\}$, where $l_k(t)$ is the kth column vector of $L(t)$ in (a) of
Definition 2.1. Then $\beta(t)$ constitutes a Lyapunov basis with respect to which the
PD-eigenvector $p(t)$ of $\mathcal{P}_A$ can be represented by $p_0(\rho(t))$.

2. It follows from (2.7) and (2.8) that if $p(t)$ and $q^T(t)$ are column and row PD-eigenvectors,
respectively, of $A(t)$ associated with a PD-eigenvalue $\rho(t)$, then $-\rho(t)$ is also a PD-eigenvalue
of $-A^T(t)$ with an associated column PD-eigenvector $-q(t)$ and a row PD-eigenvector $-p^T(t)$.

The following definition introduces the notion of a differentially distinct set. This notion is
subsequently used to define the concept of a PD-spectrum of a VPDO.

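Definition 2.2.

Let $\{\rho_i(t)\}_{i=1}^{k}$ be a set of k PD-eigenvalues of $A(t)$. The set is said to be differentially
distinct if the associated set of column PD-eigenvectors $\{p_i(t)\}_{i=1}^{k}$ is linearly independent.

Remark.

Being in a set, the $\rho_i(t)$ are distinct in the sense that $\rho_i(t) \ne \rho_j(t)$ for $i \ne j$. However, they are not
necessarily differentially distinct. Consider, for example, the set

$$\{\rho_i(t)\}_{i=1}^{3} = \left\{ -\tfrac{1}{2},\ \tfrac{1}{2},\ \frac{e^t - 1}{2(e^t + 1)} \right\}$$

for $A = \mathrm{comp}[1,\ 0.25,\ -4]$. The associated column PD-eigenvectors are

$$p_1 = \begin{bmatrix} 1 \\ -\tfrac{1}{2} \\ \tfrac{1}{4} \end{bmatrix}, \quad p_2 = \begin{bmatrix} 1 \\ \tfrac{1}{2} \\ \tfrac{1}{4} \end{bmatrix}, \quad p_3 = \begin{bmatrix} 1 \\ \frac{e^t - 1}{2(e^t + 1)} \\ \tfrac{1}{4} \end{bmatrix}$$

which are clearly linearly dependent.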
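The example in the Remark can be reproduced numerically. The sketch below assumes that $\mathrm{comp}[1, 0.25, -4]$ denotes the companion matrix whose last row is $[1, 0.25, -4]$, and it builds each PD-eigenvector as $[1, \rho, \dot{\rho} + \rho^2]^T$, that is $[1, \mathcal{Q}_\rho(1), \mathcal{Q}_\rho^2(1)]^T$ with the identity as the Lyapunov transformation. Derivatives are taken by central differences, so the residuals are only approximately zero; the helper names are illustrative.

```python
import math

# Companion matrix comp[1, 0.25, -4], read here as "last row = [1, 0.25, -4]" (an assumption).
A_C = [[0.0, 1.0, 0.0],
       [0.0, 0.0, 1.0],
       [1.0, 0.25, -4.0]]


def rho1(t):
    return -0.5


def rho2(t):
    return 0.5


def rho3(t):
    e = math.exp(t)
    return (e - 1.0) / (2.0 * (e + 1.0))


def derivative(f, t, h=1e-5):
    """Central-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


def pd_vector(rho, t):
    """Column PD-eigenvector [1, rho, rho' + rho^2] for a 3rd-order companion form."""
    r = rho(t)
    return [1.0, r, derivative(rho, t) + r * r]


def residual(rho, t, h=1e-4):
    """Largest entry of |A_c p - rho p - dp/dt|; near zero for a PD-eigenvector."""
    p = pd_vector(rho, t)
    p_plus, p_minus = pd_vector(rho, t + h), pd_vector(rho, t - h)
    worst = 0.0
    for i in range(3):
        lhs = sum(A_C[i][j] * p[j] for j in range(3)) - rho(t) * p[i]
        p_dot = (p_plus[i] - p_minus[i]) / (2.0 * h)
        worst = max(worst, abs(lhs - p_dot))
    return worst


def det3(cols):
    """Determinant of the 3x3 matrix whose columns are the three given vectors."""
    (a, d, g), (b, e, h), (c, f, i) = cols
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


if __name__ == "__main__":
    for t in (0.0, 0.7, 2.0):
        vectors = [pd_vector(r, t) for r in (rho1, rho2, rho3)]
        print(t, [residual(r, t) for r in (rho1, rho2, rho3)], det3(vectors))
```

All three functions satisfy $\dot{\rho} + \rho^2 = 1/4$, so the third component of every vector is $1/4$ and the determinant vanishes, which is the linear dependence noted in the Remark.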
Definition 2.3.

(a) A differentially distinct set $\{\rho_i(t)\}_{i=1}^{n}$ of n PD-eigenvalues for an n-dimensional VPDO
$\mathcal{P}_A$ is called a PD-spectrum for $\mathcal{P}_A$, and for the associated $n \times n$ matrix $A(t)$.

(b) A PD-spectrum $\{\rho_i(t)\}_{i=1}^{n}$ for an n-dimensional VPDO $\mathcal{P}_A$ together with a set of
associated PD-eigenvectors $\{p_i(t)\}_{i=1}^{n}$ is called a PD-eigenstructure for $\mathcal{P}_A$, and for the
associated $n \times n$ matrix $A(t)$.

3.  Extension  of  Silverman-Wolovich  Transformation 

Using the VPDO $\mathcal{P}_A$, the controllability matrix for MV LTV system (1.10) can be written as

$$\mathcal{C}(t) = [\,B(t) \mid \mathcal{P}_A B(t) \mid \cdots \mid \mathcal{P}_A^{n-1} B(t)\,]$$

The pair $\{A(t), B(t)\}$ is u.c.c. if and only if rank $\mathcal{C}(t) = n$. Thus, if $\{A(t), B(t)\}$ is u.c.c., then
for any fixed $t$, there exist indices $n_1(t), n_2(t), \ldots, n_l(t)$ with $\sum_{j=1}^{l} n_j(t) = n$ such that

$$\mathrm{rank}\,P(t) = \mathrm{rank}\,[\,P_1(t) \mid P_2(t) \mid \cdots \mid P_l(t)\,] = n$$

where

$$P_j(t) = [\,p_{j1}(t) \mid p_{j2}(t) \mid \cdots \mid p_{j n_j(t)}(t)\,]$$

with

$$p_{jk}(t) = \mathcal{P}_A^{k-1} b_j(t)$$

where $b_j(t)$ is the jth column vector of $B(t)$. The set $\beta = \{p_{jk}(t)\}$ of the n column vectors of
$P(t)$ is called a lexicographic basis at $t$ for $\mathbb{R}^n$ generated by $\{A(t), B(t)\}$, and the set of indices
$\{n_j(t)\}$ is called a set of lexicographic indices. A u.c.c. pair $\{A(t), B(t)\}$ is said to have a
lexicographic Lyapunov basis if a set of constant lexicographic indices $\{n_j\}$ can be chosen for all
$t$.
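The columns $\mathcal{P}_A^{k-1} B(t)$ can be formed numerically once $\mathcal{P}_A = \delta - A(t)$ is applied as "differentiate, then subtract $A(t)$ times the argument". The following sketch does this for a single-input, second-order example with an assumed $A(t)$ in companion form and constant $B$; neither is taken from the report, and the time derivative is approximated by a central difference.

```python
def A(t):
    """Assumed 2x2 companion-form system matrix (illustrative only)."""
    return [[0.0, 1.0], [-(2.0 * t + 1.0), -(t + 2.0)]]


def B(t):
    """Assumed constant single-input matrix."""
    return [[0.0], [1.0]]


def apply_PA(M, t, h=1e-5):
    """Return (d/dt - A(t)) M(t) for a matrix-valued function M, column by column."""
    m_plus, m_minus, m0, a = M(t + h), M(t - h), M(t), A(t)
    rows, cols, n = len(m0), len(m0[0]), len(a)
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            d = (m_plus[i][j] - m_minus[i][j]) / (2.0 * h)
            am = sum(a[i][k] * m0[k][j] for k in range(n))
            row.append(d - am)
        out.append(row)
    return out


def controllability_matrix(t):
    """C(t) = [B | P_A B] for the n = 2 example."""
    b0 = B(t)
    b1 = apply_PA(B, t)
    return [b0[i] + b1[i] for i in range(2)]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


if __name__ == "__main__":
    for t in (0.0, 1.0, 4.0):
        c = controllability_matrix(t)
        print(t, c, det2(c))
```

For this example the determinant equals 1 for every $t$, so the single lexicographic index $n_1 = 2$ can be chosen independently of $t$.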

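Let $\beta$ be a lexicographic Lyapunov basis. Let $d_i = \sum_{j=1}^{i} n_j$ with $d_0 = 0$, where $n_j$ are the
lexicographic indices for $\beta$. Then the generic multi-variable phase variable (MVPV) canonical
form $\{A_p(t), B_p(t)\}$ associated with $\beta$ is given by

$$A_p(t) = \begin{bmatrix} A_{11}(t) & A_{12}(t) & \cdots & A_{1l}(t) \\ A_{21}(t) & A_{22}(t) & \cdots & A_{2l}(t) \\ \vdots & \vdots & & \vdots \\ A_{l1}(t) & A_{l2}(t) & \cdots & A_{ll}(t) \end{bmatrix} \tag{3.1}$$

$$B_p(t) = [\,b_{p1} \mid b_{p2} \mid \cdots \mid b_{pl}\,] \tag{3.2}$$

where

$$A_{ii}(t) = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ \alpha_{i,d_{i-1}+1} & \alpha_{i,d_{i-1}+2} & \alpha_{i,d_{i-1}+3} & \cdots & \alpha_{i,d_i} \end{bmatrix}$$

$$A_{ik}(t) = \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 0 \\ \alpha_{i,d_{k-1}+1} & \alpha_{i,d_{k-1}+2} & \alpha_{i,d_{k-1}+3} & \cdots & \alpha_{i,d_k} \end{bmatrix}, \quad k \ne i$$

and $b_{pi} = e_{d_i}$, where $e_i$ is the ith standard basis vector for $\mathbb{R}^n$.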

If a pair $\{A(t), B(t)\}$ is u.c.c. with a lexicographic Lyapunov basis, then it can be reduced
to the MVPV canonical form $\{A_p(t), B_p(t)\}$ by a Lyapunov transformation. Control synthesis
can then be done in the MVPV canonical form by state feedback. The Lyapunov transformation
matrix can be obtained by an algorithm developed by Silverman (1965) for single-input LTV
systems, and by Wolovich [13] (1968) (see also [15] by Seal and Stubberud, 1969) for
multi-input LTV systems.

The main result of this section extends the Silverman-Wolovich (S-W) transformation to the
entire class of u.c.c. MV LTV systems. For u.c.c. systems without a lexicographic Lyapunov
basis, an input coordinate transformation is applied so that a lexicographic Lyapunov basis is
obtained for the transformed inputs. The S-W transformation algorithm is modified to avoid the
inversion of the lexicographic Lyapunov basis matrix $P(t)$ by orthonormalizing the lexicographic
Lyapunov basis. The VPDO and the adjoint VPDO notations are used to simplify the
representation of the algorithm.

Theorem 3.1.

Let the MV LTV system (1.10) be uniformly completely controllable. Then by a state
coordinate transformation

$$x(t) = L(t)z(t)$$

and an input coordinate transformation

$$u(t) = T(t)v(t)$$

the MV LTV system (1.10) can be reduced to the MVPV canonical form

$$\dot{z} = A_p(t)z + B_p(t)v \tag{3.3}$$

$$y = C_p(t)z + D_p(t)v$$

with lexicographic indices $n_1, n_2, \ldots, n_l$, where

$$A_p(t) = L^{-1}(t)\,[A(t)L(t) - \dot{L}(t)] \quad \text{(equivalently, } \mathcal{P}_{A_p} = L^{-1}(t)\,\mathcal{P}_A\,L(t) \text{ as operators)}$$

$$B_p(t) = L^{-1}(t)B(t)T(t)$$

are of the form (3.1) and (3.2), and

$$C_p(t) = C(t)L(t)$$

$$D_p(t) = D(t)T(t)$$

Proof (Outline). For a u.c.c. pair $\{A(t), B(t)\}$ without a lexicographic Lyapunov basis, first
apply the elementary column operations of the third kind on the controllability matrix $\mathcal{C}(t)$,
namely adding to a column another column multiplied by a constant, to obtain a Lyapunov basis.
These elementary operations can be facilitated by a constant coordinate transformation in the
input space

$$u(t) = T_1(t)v_1(t)$$

Then in the transformed coordinates the pair $\{A(t), B_1(t)\}$ will have a lexicographic Lyapunov
basis, where

$$B_1(t) = B(t)T_1(t) = [\,b_1(t) \;\; b_2(t) \;\; \cdots \;\; b_l(t)\,]$$

The MVPV canonical form $\{A_p(t), B_p(t)\}$ can now be obtained from $\{A(t), B_1(t)\}$ by a state
coordinate transformation $x(t) = L(t)z(t)$ and an input coordinate transformation
$v_1(t) = T_2(t)v(t)$ as follows.

Let $\beta = \{p_{jk}(t) = \mathcal{P}_A^{k-1} b_j(t)\}$ be an orthonormal lexicographic Lyapunov basis for
$\{A(t), B_1(t)\}$ with lexicographic indices $n_1, n_2, \ldots, n_l$. Then the state coordinate transformation
matrix $L(t)$ is given in terms of $R(t) = L^{-1}(t)$ by

$$R(t) = \begin{bmatrix} R_1(t) \\ R_2(t) \\ \vdots \\ R_l(t) \end{bmatrix}$$

where the $n_i \times n$ submatrices $R_i(t)$ are given by

$$R_i(t) = \begin{bmatrix} r_{i1}^T(t) \\ r_{i2}^T(t) \\ \vdots \\ r_{i n_i}^T(t) \end{bmatrix}$$

where

$$r_{ik}(t) = \mathcal{Q}_A^{k-1}\,\mathcal{P}_A^{n_i - 1}\,b_i(t)$$

If $\beta$ is not orthonormal, let $P(t)$ be the matrix associated with $\beta$ and let $G(t)$ be the left (row)
orthonormalization matrix for $P(t)$ such that

$$\bar{P}(t) = G(t)P(t)$$

is unitary, and let

$$\bar{A}(t) = [G(t)A(t) + \dot{G}(t)]\,G^{-1}(t) \quad \text{(equivalently, } \mathcal{P}_{\bar{A}} = G(t)\,\mathcal{P}_A\,G^{-1}(t) \text{ as operators)}$$

$$\bar{B}(t) = G(t)B_1(t)$$

Then the state transformation matrix $L(t)$ can be written as

$$L(t) = G^{-1}(t)\bar{L}(t)$$

where $\bar{L}(t)$ is obtained from the orthonormalized basis represented by $\bar{P}$ and the system
$\{\bar{A}(t), \bar{B}(t)\}$ as described above.

The input coordinate transformation matrix $T_2(t)$ is the right (column) orthonormalization
matrix for $R(t)B_1(t)$. The overall input coordinate transformation $T(t)$ is given by
$T(t) = T_1(t)T_2(t)$. □

A  MAPLE  implementation  of  the  Extended  S-W  transformation  has  been  developed.  It  will 
be  used  in  the  design  and  implementation  of  the  eigenstructure  assignment  control  in  Sections  5,  6 
and  7. 

4. PD-Eigenvector Based Criteria for Uniform Controllability and Observability

In this section PD-eigenvector based criteria for uniform controllability and observability are
obtained. These results are important in their own right because they are natural LTV
counterparts to the well known Popov-Belevitch-Hautus eigenvector tests for controllability and
observability of LTI systems [14]. They will also constitute the basis for output/state decoupling
by PD-eigenstructure assignment. The first main result of this section gives a necessary and
sufficient condition on the pointwise controllability and observability using PD-eigenvectors.

Theorem 4.1 (PD-eigenvector criteria for controllability and observability).

A MV LTV system $\{A(t), B(t), C(t)\}$ is not completely controllable at $t_1$ if and only if
there is a row PD-eigenvector $q^T(t)$ of $A(t)$ such that $\gamma(t_1) = q^T(t_1)B(t_1) = 0$ and
$\dot{\gamma}(t_1) = 0$. It is not completely observable at $t_1$ if and only if there is a column PD-eigenvector
$p(t)$ of $A(t)$ such that $\eta(t_1) = C(t_1)p(t_1) = 0$ and $\dot{\eta}(t_1) = 0$.
Proof. The proof will be given for the controllability statement only, as the statement for
observability can be proved by proving controllability for the adjoint system. Let $\mathcal{C}(t) =
[\,C_1(t) \mid C_2(t) \mid \cdots \mid C_n(t)\,]$ be the controllability matrix for $\{A(t), B(t)\}$, where
$C_k(t) = \mathcal{P}_A^{k-1}B(t)$.

Suppose that $q^T(t)$ is a row PD-eigenvector for $A(t)$ associated with a PD-eigenvalue
$\rho(t)$ such that $\gamma_1(t_1) = q^T(t_1)B(t_1) = q^T(t_1)C_1(t_1) = 0$ and $\dot{\gamma}_1(t_1) = 0$. Let
$\gamma_k(t) = q^T(t)C_k(t)$. We shall first show by induction that $\gamma_k(t_1) = -\rho(t_1)\gamma_{k-1}(t_1) = 0$ and
$\dot{\gamma}_k(t_1) = 0$, $k = 2, 3, \ldots, n$. To this end, note that $\dot{\gamma}_{k-1}(t_1) = 0$ implies that
$q^T(t_1)\dot{C}_{k-1}(t_1) = -\dot{q}^T(t_1)C_{k-1}(t_1)$. It then follows from the induction hypothesis and (2.8) that

$$\begin{aligned}
\gamma_k(t_1) &= q^T(t_1)\,C_k(t_1) \\
&= q^T(t_1)\,\mathcal{P}_A C_{k-1}(t_1) \\
&= q^T(t_1)\,\dot{C}_{k-1}(t_1) - q^T(t_1)A(t_1)\,C_{k-1}(t_1) \\
&= -[\,\dot{q}^T(t_1) + q^T(t_1)A(t_1)\,]\,C_{k-1}(t_1) \\
&= -\rho(t_1)\,q^T(t_1)\,C_{k-1}(t_1) \\
&= -\rho(t_1)\,\gamma_{k-1}(t_1) \\
&= 0
\end{aligned}$$

Moreover

$$\dot{\gamma}_k(t_1) = -\dot{\rho}(t_1)\gamma_{k-1}(t_1) - \rho(t_1)\dot{\gamma}_{k-1}(t_1) = 0$$

Consequently, $q^T(t_1)\,\mathcal{C}(t_1) = 0$. Since $q^T(t_1) \ne 0$, rank $\mathcal{C}(t_1) < n$. Thus $\{A(t), B(t)\}$ is not
completely controllable at $t_1$.

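Conversely, suppose that rank $\mathcal{C}(t_1) = r < n$, so that $\{A(t), B(t)\}$ is not completely
controllable at $t_1$. Then there exists a Lyapunov basis $\Gamma$ with respect to which $\{A(t), B(t)\}$ has
a representation $\{\bar{A}(t), \bar{B}(t)\}$ such that

$$\bar{A}(t_1) = \begin{bmatrix} \bar{A}_{11}(t_1) & \bar{A}_{12}(t_1) \\ 0 & \bar{A}_{22}(t_1) \end{bmatrix}, \qquad \bar{B}(t_1) = \begin{bmatrix} \bar{B}_1(t_1) \\ 0 \end{bmatrix}$$

where $\bar{A}_{11}(t_1) \in \mathbb{R}^{r \times r}$ is completely controllable, $\bar{A}_{22}(t_1) \in \mathbb{R}^{s \times s}$, $s = n - r$, is completely
uncontrollable, and $\bar{B}_1(t_1) \in \mathbb{R}^{r \times l}$ is nonzero. Let $\rho(t)$ be a PD-eigenvalue for $\bar{A}_{22}(t)$ with an
associated row PD-eigenvector $z^T(t)$. Let $\bar{q}^T(t) = q^T(t)L(t) = [\,0 \mid z^T(t)\,]$, where $L(t)$ is the
coordinate transformation matrix from the standard basis to $\Gamma$. Clearly $\gamma(t_1) = q^T(t_1)B(t_1) =
\bar{q}^T(t_1)\bar{B}(t_1) = 0$, and $\dot{\gamma}(t_1) = 0$. It remains to show that $\rho(t)$ is a PD-eigenvalue
for $A(t)$ and $q^T(t)$ is a row PD-eigenvector for $A(t)$. Clearly, $\bar{q}^T(t)$ satisfies

$$\begin{aligned}
-\dot{\bar{q}}^T(t) &= [\,0 \mid -\dot{z}^T(t)\,] \\
&= [\,0 \mid z^T(t)[\bar{A}_{22}(t) - \rho(t)I_s]\,] \\
&= \bar{q}^T(t)\bar{A}(t) - \rho(t)\bar{q}^T(t)
\end{aligned}$$

By Remark 2 to Definition 2.1, $z(t)$ is a column PD-eigenvector for $-\bar{A}_{22}^T(t)$. Thus, there exists
a Lyapunov basis $\{\zeta_1(t), \zeta_2(t), \ldots, \zeta_s(t)\}$ for $\mathbb{R}^s$ such that

$$z(t) = \sum_{i=1}^{s} [\mathcal{Q}_\rho^{i-1}(1)]\,\zeta_i(t)$$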

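Now construct a Lyapunov basis $\{\beta_1(t), \beta_2(t), \ldots, \beta_n(t)\}$ for $\mathbb{R}^n$ as follows

$$\beta_1(t) = \begin{bmatrix} \eta_0(t) \\ \zeta_1(t) \end{bmatrix}, \qquad \beta_j(t) = \begin{bmatrix} 0 \\ \zeta_j(t) \end{bmatrix}, \quad j = 2, \ldots, s$$

$$\beta_j(t) = e_{j-s}, \quad j = s+1, \ldots, n$$

where $e_k$ is the kth standard basis vector for $\mathbb{R}^n$ (with its first $r$ entries written in the upper block for $\eta_0$) and $\eta_0(t)$ is chosen by

$$\eta_0(t) = -\sum_{j=s+1}^{n} [\mathcal{Q}_\rho^{j-1}(1)]\,e_{j-s}$$

Then

$$\bar{q}(t) = \sum_{i=1}^{n} [\mathcal{Q}_\rho^{i-1}(1)]\,\beta_i(t)$$

It then follows from Remark 2 to Definition 2.1 that $\rho(t)$ is a PD-eigenvalue for $A(t)$ and $q^T(t)$
is a row PD-eigenvector for $A(t)$. □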
In the following corollary to Theorem 4.1, a MV LTV system $\{A(t), B(t), C(t)\}$ is said to
be uniformly uncontrollable if the controllability matrix $\mathcal{C}(t)$ for $\{A(t), B(t)\}$ satisfies
rank $\mathcal{C}(t) < n$, $\forall t \ge 0$. It is said to be uniformly unobservable if the observability matrix $\mathcal{O}(t)$
for $\{A(t), C(t)\}$ satisfies rank $\mathcal{O}(t) < n$, $\forall t \ge 0$.

Corollary 4.1.

A MV LTV system $\{A(t), B(t), C(t)\}$ is uniformly uncontrollable if and only if $A(t)$ has a
row PD-eigenvector $q^T(t)$ such that $q^T(t)B(t) = 0$. It is uniformly unobservable if and
only if $A(t)$ has a column PD-eigenvector $p(t)$ such that $C(t)p(t) = 0$. □
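In the time-invariant special case the criterion reduces to the familiar Popov-Belevitch-Hautus left-eigenvector test, which can be checked on a toy example. The sketch below assumes constant matrices `A` and `B` chosen for illustration, and it assumes that for constant `A` a constant left eigenvector qualifies as a row PD-eigenvector (with $\dot{q} = 0$, equation (2.8) reduces to $q^T A = \rho q^T$). The controllability matrix follows the convention $[\,B \mid \mathcal{P}_A B\,] = [\,B \mid -AB\,]$ for constant `B`.

```python
A = [[-1.0, 0.0],
     [0.0, -2.0]]      # assumed constant system matrix (illustrative only)
B = [[1.0],
     [0.0]]            # the input does not reach the second state

q = [0.0, 1.0]         # candidate row PD-eigenvector (a constant left eigenvector)
rho = -2.0             # associated eigenvalue


def mat_vec_col(a, b):
    """Product a @ b for a 2x2 matrix a and a 2x1 column b."""
    return [[a[i][0] * b[0][0] + a[i][1] * b[1][0]] for i in range(2)]


def controllability_matrix(a, b):
    """[B | P_A B] with P_A = d/dt - A; for constant B this is [B | -A B]."""
    ab = mat_vec_col(a, b)
    return [[b[i][0], -ab[i][0]] for i in range(2)]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def row_times_matrix(row, m):
    """Product row^T @ m for a 2-vector and a matrix with two rows."""
    return [row[0] * m[0][j] + row[1] * m[1][j] for j in range(len(m[0]))]


def left_eigen_residual(row, lam, a):
    """Entries of q^T A - rho q^T; all zero for a left eigenvector."""
    qa = row_times_matrix(row, a)
    return [qa[j] - lam * row[j] for j in range(2)]


if __name__ == "__main__":
    c = controllability_matrix(A, B)
    print("controllability matrix:", c, "det:", det2(c))
    print("q^T A - rho q^T:", left_eigen_residual(q, rho, A))
    print("q^T B:", row_times_matrix(q, B))
```

Here the rank deficiency of the controllability matrix and the existence of a row eigenvector annihilating `B` occur together, as the corollary states for the time-varying case.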

In the following corollary to Theorem 4.1, a MV LTV system $\{A(t), B(t), C(t)\}$ is said to
be uniformly r-controllable if the controllability matrix $\mathcal{C}(t)$ for $\{A(t), B(t)\}$ satisfies rank
$\mathcal{C}(t) = r$. It is said to be uniformly r-observable if the observability matrix $\mathcal{O}(t)$ for
$\{A(t), C(t)\}$ satisfies rank $\mathcal{O}(t) = r$.

Corollary 4.2.

A MV LTV system $\{A(t), B(t), C(t)\}$ is uniformly r-controllable if and only if there exist
exactly $s = n - r$ row PD-eigenvectors $q_j^T(t)$ such that $q_j^T(t)B(t) = 0$, $j = 1, 2, \ldots, s$. It is
uniformly r-observable if and only if there exist exactly $s = n - r$ column PD-eigenvectors $p_j(t)$
such that $C(t)p_j(t) = 0$, $j = 1, 2, \ldots, s$. □

The following theorem gives PD-eigenvector based criteria for modal controllability and
observability. It sheds some light on how the orientation of the PD-eigenvectors affects the
"degree" of controllability and observability. The result on modal observability will be used
subsequently in Section 6 to develop the output/state decoupling technique by PD-eigenstructure
assignment.
Theorem 4.2 (Modal controllability and observability).

Let $\{A(t), B(t), C(t)\}$ be a MV LTV system and let $b_j(t)$ and $c_i^T(t)$ be the jth column
vector of $B(t)$ and the ith row vector of $C(t)$, respectively. If $q_i^T(t)$ is a row PD-eigenvector of
$A(t)$ associated with the ith PD-eigenvalue $\rho_i(t)$ such that $q_i^T(t)b_j(t) = 0$, then the associated
ith mode $\exp\int_0^t \rho_i(\tau)\,d\tau$ cannot be altered by the jth input $u_j(t)$ with the state feedback control
law $u_j(t) = k_j^T(t)x(t)$ for any gain $k_j(t)$. Similarly, if $p_j(t)$ is a column PD-eigenvector of
$A(t)$ associated with the jth PD-eigenvalue $\rho_j(t)$ such that $c_i^T(t)p_j(t) = 0$, then the associated
jth mode $\exp\int_0^t \rho_j(\tau)\,d\tau$ is not observable from the ith output $y_i(t)$ with a state observer for any
observer gain $h_j(t)$.

Proof of Theorem 4.2. The proof will be given only for the controllability statement. The
statement for observability can be proved by applying the proof for controllability to the adjoint
system. Let $\{\rho_k(t)\}_{k=1}^n$ be a PD-spectrum for $A(t)$ and let $Q(t) = [\,q_1(t) \mid q_2(t) \mid \cdots \mid q_n(t)\,]^T$
be the associated row PD-modal matrix consisting of the row PD-eigenvectors $q_i^T(t)$. Then the
coordinate transformation $z(t) = Q(t)x(t)$ results in $\dot z = A_q(t)z + B_q(t)u$, where

$$A_q(t) = Q(t)\,\mathcal{D}_A\,Q^{-1}(t) = \mathrm{diag}[\rho_1(t),\ \rho_2(t),\ \ldots,\ \rho_n(t)]$$

$$B_q(t) = Q(t)B(t)$$

Now suppose that $q_i^T(t)b_j(t) = 0$ and let $u_j(t) = k_j^T x(t) = k_j^T Q^{-1}(t)z(t)$. Then

$$\dot z_i(t) = \rho_i(t)z_i(t)$$

Clearly, the ith mode $z_i(t) = \exp\int_0^t \rho_i(\tau)\,d\tau$ cannot be altered by the jth input $u_j(t) = k_j^T x(t)$
for any $k_j(t)$. □
Remarks.

1. Although the proof for the observability statement in Theorem 4.2 is omitted, it is instructive
to note that $y_i(t) = c_i^T(t)x(t)$. Now let $P(t) = [\,p_1(t) \mid p_2(t) \mid \cdots \mid p_n(t)\,]$ be the column
PD-modal matrix associated with a PD-spectrum $\{\rho_k(t)\}_{k=1}^n$. Then the coordinate
transformation $x(t) = P(t)z(t)$ results in $\dot z = A_p(t)z + B_p(t)u$, $y(t) = C_p(t)z$, where

$$A_p(t) = P^{-1}(t)\,\mathcal{D}_A\,P(t) = \mathrm{diag}[\rho_1(t),\ \rho_2(t),\ \ldots,\ \rho_n(t)]$$

$$B_p(t) = P^{-1}(t)B(t)$$

$$C_p(t) = C(t)P(t)$$

Thus, the ith output $y_i(t)$ is given by

$$y_i(t) = \sum_{j=1}^{n} c_i^T(t)\,p_j(t)\,z_j(t)$$

Clearly, if $c_i^T(t)p_j(t) = 0$, the jth forced mode

$$z_j(t) = e^{\int_0^t \rho_j(\sigma)\,d\sigma} z_j(0) + \int_0^t e^{\int_\tau^t \rho_j(\sigma)\,d\sigma} \sum_{k} b_{jk}(\tau)\,u_k(\tau)\,d\tau$$

where $b_{jk}(t)$ denotes the $(j,k)$ entry of $B_p(t)$, will be absent from $y_i(t)$. Consequently, the jth
free mode $\exp\int_0^t \rho_j(\tau)\,d\tau$ is not observable from $y_i(t)$.

2. The above arguments constitute the basis for the output/state decoupling by
PD-eigenstructure assignment. If it is desirable to decouple the ith output $y_i(t)$ from the jth
closed-loop mode $\exp\int_0^t \rho_j(\tau)\,d\tau$, assign the jth component of the closed-loop eigenstructure
$\{\rho_j(t),\ p_j(t)\}$ such that $c_i^T(t)p_j(t) = 0$. To decouple the ith state from the jth closed-loop
mode, let $c_i^T(t) = e_i^T$, where $e_i$ is the ith standard basis vector.

5.  Stabilization  by  PD-Spectrum  Assignment 

In this section the PD-spectrum based stability criterion for the scalar LTV system (1.1) is first
extended to MV LTV systems (1.0). Then a result is presented for feedback stabilization of MV
LTV systems using PD-spectrum assignment by way of the S-W transformation. The design
procedure for PD-spectrum assignment is also provided. The necessary and sufficient stability
criterion based on a PD-spectrum uses an extended-mean concept as defined below.


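Definition 5.1.

Let $\sigma : I \to \mathbb{R}$ be a locally integrable function on the interval $I = [T_0, \infty)$. The extended
mean of $\sigma(t)$ over $I$ is defined by

$$\operatorname{em}(\sigma) = \limsup_{T \to \infty,\ t_0 \ge T_0} \frac{1}{T} \int_{t_0}^{t_0+T} \sigma(\tau)\,d\tau = \limsup_{t - t_0 \to \infty,\ t_0 \ge T_0} \frac{1}{t - t_0} \int_{t_0}^{t} \sigma(\tau)\,d\tau$$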
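
The definition can be approximated numerically, which may help build intuition for the limit superior. The sketch below is only a finite stand-in: it evaluates the windowed mean for one large but finite window length and takes the largest value over a grid of start times. The function names, the window length, and the test signal are illustrative assumptions.

```python
import numpy as np

def windowed_mean(sigma, t0, T, samples=20001):
    """Approximate (1/T) * integral of sigma over [t0, t0 + T] by the trapezoidal rule."""
    t = np.linspace(t0, t0 + T, samples)
    y = sigma(t)
    integral = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t))
    return integral / T

def extended_mean_estimate(sigma, T0=0.0, T=1000.0, starts=50, span=100.0):
    """Finite stand-in for the lim sup: largest windowed mean over a grid of start times."""
    t0_grid = np.linspace(T0, T0 + span, starts)
    return max(windowed_mean(sigma, t0, T) for t0 in t0_grid)

# Hypothetical test signal: an oscillation around -1, so the long-run mean is -1.
sigma = lambda t: -1.0 + np.cos(t)
em = extended_mean_estimate(sigma)
print(em)
```

For this signal the windowed mean differs from -1 by at most 2/T, so the estimate approaches -1 as the window grows even though the signal itself touches zero periodically.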


Theorem 5.1.

Let $\mathcal{D}_A$ be a VPDO having a PD-spectrum $\{\rho_k(t)\}_{k=1}^n$ with $|\rho_k(t)| < M$, $t \ge 0$, for some
$M < \infty$. Let $p_k(t)$ and $q_k^T(t)$ be a column PD-eigenvector and a row PD-eigenvector
associated with $\rho_k(t)$, respectively. Then the null solution to the LTV system $\mathcal{D}_A x = 0$ is
uniformly asymptotically stable for all $t_0 \ge T_0$ if and only if

(i) there exists a $0 < c_k < \infty$ such that

$$\operatorname{em}(\mathrm{Re}\,\rho_k(t)) = -c_k < 0$$

and moreover,

(ii) there exist $h_k > 0$ and $0 < d_k < c_k$ such that

$$\|p_k(t)\,q_k^T(t_0)\| < h_k\,e^{d_k (t - t_0)}$$

for all $t \ge t_0 \ge T_0$. □

Remarks. 

1. Condition (ii) is automatically satisfied if the imaginary parts of all PD-eigenvalues are of
polynomial order or slower; that is, an integer $m > 0$ exists such that

$$\lim_{t \to +\infty} t^{-m}\,\mathrm{Im}\{\rho_k(t)\} = 0, \qquad k = 1, 2, \ldots, n$$

In particular, it holds if $\mathrm{Im}\{\rho_k(t)\}$ are uniformly bounded.

2. If $\operatorname{em}(\mathrm{Re}\,\rho_k(t)) > 0$ for some $t_0 \ge T_0$ and $1 \le k \le n$, then the null solution to $\mathcal{D}_A x = 0$
is unstable. However, if $\operatorname{em}(\mathrm{Re}\,\rho_k(t)) = 0$ for some $t_0 \ge T_0$ and $1 \le k \le n$, the null
solution may be either stable, asymptotically stable, or unstable, but it cannot be
exponentially stable.


The proof of Theorem 5.1 is based on the results for SPDO presented in [16] and the fact
that Lyapunov transformations preserve stability. Thus it is omitted here. The following
Theorem 5.2 facilitates stabilization of MV LTV systems using PD-spectrum assignment by way
of Silverman-Wolovich transformations.


Theorem 5.2.

Let $A(t) = \mathrm{diag}[A_1(t),\ A_2(t),\ \ldots,\ A_l(t)]$, where the $A_i(t)$ are bounded $n_i \times n_i$ companion
matrices. If $\rho(t)$ is a PD-eigenvalue of $A_i(t)$ for some $i \le l$ with an associated $n_i$-dimensional
column PD-eigenvector $p_i(t)$, then it is a PD-eigenvalue for $A(t)$ with an associated column
PD-eigenvector $p(t)$ generated from $p_i(t)$.

Proof. Without loss of generality, suppose $\rho(t)$ is a PD-eigenvalue of $A_1(t)$ with an associated
PD-eigenvector $p_1(t)$. For if $i \ne 1$, a constant similarity transformation $\bar A(t) = L^{-1}A(t)L$ will
swap $A_1(t)$ and $A_i(t)$, and the following arguments remain valid under similarity transformations.
Now to show that $\rho(t)$ is a PD-eigenvalue for $A(t)$, we need to construct a PD-eigenvector $p(t)$
from $p_1(t)$ satisfying

$$\dot p(t) = [A(t) - \rho(t)I]\,p(t) \qquad (5.1)$$

and

(5.2)


Remark.

Let $A(t)$ be a block upper (lower) triangular matrix. For a constant matrix $A(t) = A$, it holds that
if $\lambda$ is an eigenvalue of a block matrix $A_{ii}$ on the diagonal, then $\lambda$ is an eigenvalue of $A$. However,
this is not guaranteed for a time-varying $A(t)$.

The  design  procedure  for  PD-spectrum  assignment  is  presented  below,  along  with  guidelines 
on  the  selection  of  closed-loop  PD-spectrum. 
PD-Spectrum Assignment Procedure.

1. Transform the MV LTV system (1.10) into MVPV canonical form (3.3) by a state
coordinate transformation $x(t) = L(t)z(t)$ and an input coordinate transformation
$u(t) = T(t)v(t)$ per Theorem 3.1.

2. For each block companion matrix $A_{ii}(t)$ in the MVPV matrix $A(t)$, choose the desired
PD-eigenvalues and synthesize the coefficients of the SPDO associated with $A_{ii}(t)$. Then design
the state feedback control law $v(t) = K_p(t)z(t)$ to obtain the desired closed-loop dynamics
in the MVPV coordinates.

3. The actual control law $u(t) = K(t)x(t)$ is given by $K(t) = T(t)K_p(t)L^{-1}(t)$.
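
Step 3 of the procedure is a plain change of coordinates and can be sketched at a single time instant as follows. The numerical values of the transformations and of the canonical-coordinate gain are hypothetical placeholders, and `actual_gain` is an illustrative helper name, not something defined in the report.

```python
import numpy as np

def actual_gain(T, Kp, L):
    """Return K = T Kp L^{-1} at one time instant, without forming L^{-1} explicitly."""
    # Solve X L = T Kp for X by transposing: L^T X^T = (T Kp)^T.
    return np.linalg.solve(L.T, (T @ Kp).T).T

# Hypothetical frozen-time values for a system with n = 2 states and 1 input.
L = np.array([[1.0, 0.5],
              [0.0, 2.0]])      # state transformation x = L z
T = np.array([[3.0]])           # input transformation u = T v
Kp = np.array([[-2.0, -1.0]])   # gain designed in canonical coordinates, v = Kp z

K = actual_gain(T, Kp, L)

x = np.array([1.0, -1.0])
z = np.linalg.solve(L, x)       # z = L^{-1} x
u_direct = K @ x                # u = K x
u_chain = T @ (Kp @ z)          # u = T v, with v = Kp z
print(u_direct, u_chain)
```

Both routes give the same control value, which is all the identity in step 3 asserts. In a time-varying design the same computation would be repeated at each instant with the current $L(t)$, $T(t)$ and $K_p(t)$.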

Remarks. 

1. For BIBO stability, exponential stability must be achieved by assigning negative extended
mean to all the PD-eigenvalues. To this end, it suffices to keep $\mathrm{Re}\{\rho_k(t)\} < -\epsilon < 0$ for
some prescribed $\epsilon > 0$.

2. No identical PD-eigenvalues should be assigned within each companion block $A_{ii}(t)$. For
block size larger than 2 x 2, ensure that all PD-eigenvalues are differentially distinct.

3. If a pair of complex conjugate PD-eigenvalues $\rho_{i,j}(t) = \sigma(t) \pm j\omega(t)$ is assigned, keep $\omega(t)$
from vanishing.

4. The PD-eigenvalues should be continuously differentiable $n - 1$ times.

6.  Output/State  Decoupling  by  PD-Eigenstructure  Assignment 

In this section, a design procedure for output/state decoupling using PD-eigenstructure
assignment is developed based on Theorem 4.2. According to that theorem, to decouple the ith
output $y_i(t)$ from the jth closed-loop mode $\rho_j(t)$, one needs only to assign the jth column
PD-eigenvector $p_j(t)$ in such a way that $p_j(t)$ is orthogonal to $c_i^T(t)$ for all $t$. A natural question
is to what extent the orientation of $p_j(t)$ can be adjusted. The following theorem addresses
this issue.
Theorem 6.1 (Parametrization of PD-eigenstructure)

Let $\{\rho_k(t)\}_{k=1}^n$ be a PD-spectrum for $A(t)$ with an associated column PD-modal matrix
$P_0(t)$. Then $P(t) = P_0(t)S(t)$ is also a column PD-modal matrix for $A(t)$ if and only if $S(t)$
satisfies

$$\dot S(t) = \Gamma(t)S(t) - S(t)\Gamma(t) \qquad (6.1)$$

with $\det S(t) \ne 0$, where $\Gamma(t) = \mathrm{diag}[\rho_1(t),\ \rho_2(t),\ \ldots,\ \rho_n(t)]$. Moreover, the general solution
of $S(t)$ is given by

$$S(t) = e^{\int_0^t \Gamma(\tau)\,d\tau}\, H\, e^{-\int_0^t \Gamma(\tau)\,d\tau} \qquad (6.2)$$

where $H$ is an arbitrary nonsingular constant $n \times n$ matrix, and

$$e^{\int_0^t \Gamma(\tau)\,d\tau} = \mathrm{diag}\left[e^{\int_0^t \rho_1(\tau)\,d\tau},\ e^{\int_0^t \rho_2(\tau)\,d\tau},\ \ldots,\ e^{\int_0^t \rho_n(\tau)\,d\tau}\right] \qquad (6.3)$$
Proof. Suppose $P(t) = P_0(t)S(t)$ is a PD-modal matrix for $A(t)$. Then we have

$$\dot P_0(t) = A(t)P_0(t) - P_0(t)\Gamma(t) \qquad (6.4)$$

and

$$\frac{d}{dt}[P_0(t)S(t)] = A(t)[P_0(t)S(t)] - [P_0(t)S(t)]\Gamma(t) \qquad (6.5)$$

But

$$\frac{d}{dt}[P_0(t)S(t)] = \dot P_0(t)S(t) + P_0(t)\dot S(t) = [A(t)P_0(t) - P_0(t)\Gamma(t)]S(t) + P_0(t)\dot S(t) \qquad (6.6)$$

Since $\det P_0(t) \ne 0$, it follows from (6.5) and (6.6) that

$$\dot S(t) = \Gamma(t)S(t) - S(t)\Gamma(t).$$

Conversely, suppose that (6.1) holds. Then

$$\frac{d}{dt}[P_0(t)S(t)] = \dot P_0(t)S(t) + P_0(t)\dot S(t)$$

$$= [A(t)P_0(t) - P_0(t)\Gamma(t)]S(t) + P_0(t)[\Gamma(t)S(t) - S(t)\Gamma(t)]$$

$$= A(t)[P_0(t)S(t)] - [P_0(t)S(t)]\Gamma(t)$$

Thus $P(t) = P_0(t)S(t)$ is a column PD-modal matrix. The general solution for $S(t)$ given in
(6.2) can be verified by direct computation. □
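
The direct computation mentioned at the end of the proof can also be spot-checked numerically. The sketch below assumes a hypothetical two-element PD-spectrum with closed-form integrals and an arbitrary nonsingular $H$; it compares a central-difference derivative of the candidate solution with the right-hand side of (6.1).

```python
import numpy as np

# Hypothetical PD-spectrum: rho_1(t) = -1, rho_2(t) = -2 - sin(t).
def gamma(t):
    return np.diag([-1.0, -2.0 - np.sin(t)])

def exp_int_gamma(t):
    # Closed-form integrals from 0 to t: -t and -2 t + cos(t) - 1.
    return np.diag([np.exp(-t), np.exp(-2.0 * t + np.cos(t) - 1.0)])

H = np.array([[1.0, 2.0],
              [0.5, 3.0]])  # arbitrary nonsingular constant matrix (det = 2)

def S(t):
    # Candidate general solution: exp(int Gamma) H exp(-int Gamma).
    E = exp_int_gamma(t)
    return E @ H @ np.linalg.inv(E)

t, h = 0.7, 1e-6
S_dot = (S(t + h) - S(t - h)) / (2.0 * h)      # central-difference derivative
residual = S_dot - (gamma(t) @ S(t) - S(t) @ gamma(t))

print(np.max(np.abs(residual)))
```

The residual is at the level of finite-difference error, and evaluating `S(0.0)` returns `H`, consistent with $H$ being fixed by the initial orientation of the PD-eigenvectors.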

Remark. 

Theorem 6.1 is instrumental in the output/state decoupling by PD-eigenstructure assignment,
where the orientation of some column PD-eigenvectors should be adjusted so as to remain
orthogonal to a row vector in the output measurement matrix $C(t)$. Theorem 6.1 points out that
the orientation of a column eigenvector is determined by: (i) the constant matrix $H$, and (ii) the
associated PD-spectrum $\{\rho_k(t)\}_{k=1}^n$. Note that $H = P_0^{-1}(0)P(0)$ is fixed at $t = 0$. Thus an
optimal $H$ should be computed off-line to achieve the desired PD-eigenvector orientations for all
permissible PD-spectra. Then an on-line, real-time optimization is needed to select the optimal
PD-spectrum to achieve the best tracking performance and decoupling. This design approach is
illustrated in Figures 6.1 and 6.2.


Figure 6.1 Output Decoupling by PD-Eigenstructure Assignment

Figure 6.2 PD-Eigenstructure Assignment Logic for Output Decoupling


PD-Eigenstructure  Assignment  Procedure: 

1. Follow the PD-Spectrum Assignment Procedure of Section 5 to achieve stability.

2. To associate the ith output with, and only with, the jth mode of the closed-loop system,
assign the jth PD-eigenvalue $\rho_j(t)$ such that the ith row vector $c_i^T(t)$ of $C(t)$ becomes an
associated row PD-eigenvector, i.e.

Q[A+BKipi)-pjI]Ci  =  0  (3) 

where $K(\rho_j(t), t) = K_p(\rho_j(t), t)L^{-1}(t)$. In particular, if $c_i(t) = c_i$ is constant, then (3)
can be written as

cJ[A(t)  +  B(t)Kipj(t).t)]c, 

3. To decouple the ith output from the jth mode of the closed-loop system, assign $\rho_j(t)$ such
that the ith row vector $c_i^T(t)$ of $C(t)$ is orthogonal to the jth column PD-eigenvector $p_j(t)$,
i.e.

$$\dot p_j - [A + BK(\rho_j) - \rho_j I]\,p_j = 0 \qquad (5)$$

and

$$c_i^T(t)\,p_j(t) = 0 \qquad (6)$$


Remarks. 

1. For eigenstructure assignment in a LTI system, once the feedback control gain $K$ is
determined, the decoupling condition (6) is fixed, because the column eigenvector $p_j$ is fixed
up to a constant scaling factor, which does not change the orientation of $p_j$. For
PD-eigenstructure assignment, by contrast, condition (6) can be optimized by selecting an
optimal initial condition $p_j(0)$ for (5) and a permissible $\rho_j(t)$.

2. For eigenstructure assignment in a LTI system with $l$ control inputs, only $n$ of the $n \times l$ gains
in the feedback gain matrix $K$ are needed for eigenvalue assignment; the remaining $n \times (l - 1)$
gains can be used to alter the orientations of the eigenvectors for optimal decoupling. At
the present time, however, the eigenvalue assignment based on Theorem 5.2 requires the use of
all $n \times l$ gains, leaving no freedom in eigenvector assignment. This is not an intrinsic
limitation, and additional research is needed to alleviate it.

7.  Roll-Yaw  Decoupled  Autopilot  for  a  BTT  Missile 

In  this  section,  the  results  obtained  in  Sections  2-6  are  applied  to  the  autopilot  design  for 
command  tracking  and  roll-yaw  decoupling  of  the  EMRAAT  bank-to-turn  missile  airframe.  The 
complete  nonlinear  state  equation  of  the  EMRAAT  airframe  is  given  in  the  Appendix.  For 
simplicity,  it  is  assumed  here  that  the  roll-yaw  dynamics  are  sufficiently  decoupled  from  the  pitch 
dynamics. 

The  basic  design  approach  is  by  linearization  of  the  nonlinear  roll-yaw  airframe  along  a 
nominal  trajectory  as  follows. 
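Let the system states and inputs be chosen as

$$\xi(t) = [\,\beta(t)\ \ r(t)\ \ p(t)\ \ \phi(t)\,]^T, \qquad \delta(t) = [\,\delta_r(t)\ \ \delta_p(t)\,]^T$$

where $\beta(t)$, $r(t)$, $p(t)$, $\phi(t)$ are the sideslip, yaw rate, roll rate and roll angle, respectively, and
$\delta_r(t)$ and $\delta_p(t)$ are the rudder and aileron angles, respectively. Since all state variables are on-line
measurable, they are defined as the outputs. Thus, the measurement matrix $C(t) = I$. The
design objectives are to decouple $\beta(t)$ from $p(t)$ while maintaining good command tracking
performance.

Then the nonlinear state equation is given by

$$\dot\xi = f(\xi, \delta) = \begin{bmatrix} f_1(\xi_1, \xi_2, \xi_3, \xi_4, \delta_r, \delta_p) \\ f_2(\xi_1, \xi_2, \xi_3, \xi_4, \delta_r, \delta_p) \\ f_3(\xi_1, \xi_2, \xi_3, \xi_4, \delta_r, \delta_p) \\ f_4(\xi_1, \xi_2, \xi_3, \xi_4, \delta_r, \delta_p) \end{bmatrix}$$

For a nominal trajectory $\bar\xi(t)$, $\bar\delta(t)$ satisfying

$$\dot{\bar\xi} = f(\bar\xi, \bar\delta)$$

define the tracking errors by

$$x(t) = \xi(t) - \bar\xi(t)$$

and the tracking error control input by

$$u(t) = \delta(t) - \bar\delta(t)$$

Then the linearized tracking error dynamics are given by

$$\dot x = A(t)x + B(t)u$$
$$y = C(t)x + D(t)u$$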
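
The linearization step can be sketched numerically as Jacobians of the state equation evaluated at a point of the nominal trajectory. The routine below uses central differences; the function name `linearize` and the small two-state model are hypothetical stand-ins, since the EMRAAT roll-yaw equations are not reproduced here.

```python
import numpy as np

def linearize(f, xi_bar, delta_bar, eps=1e-6):
    """Central-difference Jacobians A = df/dxi and B = df/ddelta at a nominal point."""
    n, m = xi_bar.size, delta_bar.size
    A = np.zeros((n, n))
    B = np.zeros((n, m))
    for j in range(n):
        step = np.zeros(n)
        step[j] = eps
        A[:, j] = (f(xi_bar + step, delta_bar) - f(xi_bar - step, delta_bar)) / (2 * eps)
    for j in range(m):
        step = np.zeros(m)
        step[j] = eps
        B[:, j] = (f(xi_bar, delta_bar + step) - f(xi_bar, delta_bar - step)) / (2 * eps)
    return A, B

# Hypothetical two-state, one-input model used only to exercise the routine.
def f_toy(xi, delta):
    return np.array([xi[1], -np.sin(xi[0]) + delta[0]])

A, B = linearize(f_toy, np.array([0.0, 0.0]), np.array([0.0]))
print(A)
print(B)
```

Evaluating the same routine at successive points of a nominal trajectory would give samples of $A(t)$ and $B(t)$; the time variation of these matrices is what makes the error dynamics LTV rather than LTI.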
For the EMRAAT airframe at an altitude of 30,000 ft and Mach 2.0 [17]

oii(i)  012(0  oi3(t)  014(0 

“22(0  023(0  0 

~  "  1001  032  (t)  033(0  0 

[0  0  1  0  _ 

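$$B(t) = \begin{bmatrix} 1.979 \cdot 10^{-3}\cos(\beta) & -247.3 \cdot 10^{-6}\cos(\beta) \\ -75.99 & 17.52 \\ -959.5 & -1244 \\ 0 & 0 \end{bmatrix}$$

$$C(t) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

$$D(t) = 0$$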

where 

an(t)  =  ^{Cy, cos(;0(t))  -  (C'y,^(t)  +  Cy^m  +  Cyjit)  + 

+  CY,/p{t)  +  CYjrit))sm(^{t))}  -  :^sin(0(O)sin(^(O) 

$$a_{12}(t) = -1 + 398.21 \cdot 10^{-6}\cos(\beta(t))$$

$$a_{13}(t) = -18.056 \cdot 10^{-6}\cos(\beta(t))$$

$$a_{14}(t) = 16.631 \cdot 10^{-3}\cos(\theta(t))\cos(\phi(t))$$

022(0  =  0.3623  •  10"V(0  +  0.5024  •  10"^p(0  -  0.6053 
023(0  =  7-8  •  10“^P(0  +  0.5024  •  10"V(0  -  3.5  •  10"^ 

032(0  =  -  36.8  •  10~V(0  -  2.3  •  10"^p(0  +  0.8056 
033(t)  =  35.80  •  10"®  p(0  -  2.3  •  10"®  r  (0  -  2.177 

The overall system configuration is shown in Figure 7.1. The PD-eigenstructure assignment
control for LTV error dynamics stabilization and decoupling is shown in Figure 7.2.

Figure 7.1 Nonlinear Tracking System Configuration

Figure 7.2 LTV Tracking Error Stabilizing and Decoupling Control
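Now apply the S-W transformation to obtain the MVPV canonical form with lexicographic
indices $n_1 = 2$, $n_2 = 2$:

$$A_p(t) = \begin{bmatrix} 0 & 1 & 0 & 0 \\ a_{21}(t) & a_{22}(t) & a_{23}(t) & a_{24}(t) \\ 0 & 0 & 0 & 1 \\ a_{41}(t) & a_{42}(t) & a_{43}(t) & a_{44}(t) \end{bmatrix}, \qquad B_p(t) = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix}$$

Then design the state feedback control gain

$$K_p(t) = \begin{bmatrix} \lambda_1(t) - a_{21}(t) & \lambda_2(t) - a_{22}(t) & -a_{23}(t) & -a_{24}(t) \\ -a_{41}(t) & -a_{42}(t) & \lambda_3(t) - a_{43}(t) & \lambda_4(t) - a_{44}(t) \end{bmatrix}$$

to obtain the desired closed-loop dynamics in the MVPV coordinates

$$A_p(t) + B_p(t)K_p(t) = \begin{bmatrix} 0 & 1 & 0 & 0 \\ \lambda_1(t) & \lambda_2(t) & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & \lambda_3(t) & \lambda_4(t) \end{bmatrix}$$

where the $\lambda_i(t)$ are synthesized from the (real) PD-spectral canonical form

$$\begin{bmatrix} \sigma_1(t) & \omega_1(t) & 0 & 0 \\ -\omega_1(t) & \sigma_1(t) & 0 & 0 \\ 0 & 0 & \sigma_{21}(t) & 0 \\ 0 & 0 & 0 & \sigma_{22}(t) \end{bmatrix}$$

with the desired closed-loop PD-eigenvalues $\rho_{11,12}(t) = \sigma_1(t) \pm j\omega_1(t)$, $\rho_{21}(t) = \sigma_{21}(t)$,
$\rho_{22}(t) = \sigma_{22}(t)$, where the roll modes are chosen to be simple (nonoscillatory).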
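
The structure of this gain can be checked at a frozen time instant: each gain row cancels the corresponding open-loop companion row and inserts the desired coefficients. In the sketch below the open-loop entries are random placeholders and the desired coefficients (written `lam` here, a naming assumption) are hypothetical numbers; it verifies only the algebra of the cancellation, not the synthesis of the coefficients from a PD-spectrum.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen-time snapshot of the canonical pair; the a_ij values are random placeholders.
a2 = rng.normal(size=4)   # row 2 of A_p: a_21 .. a_24
a4 = rng.normal(size=4)   # row 4 of A_p: a_41 .. a_44
A_p = np.array([[0.0, 1.0, 0.0, 0.0],
                a2,
                [0.0, 0.0, 0.0, 1.0],
                a4])
B_p = np.array([[0.0, 0.0],
                [1.0, 0.0],
                [0.0, 0.0],
                [0.0, 1.0]])

# Hypothetical desired companion coefficients at this instant.
lam = np.array([-2.0, -3.0, -6.0, -5.0])

# Gain that cancels the open-loop rows and inserts the desired coefficients.
K_p = np.array([[lam[0] - a2[0], lam[1] - a2[1], -a2[2], -a2[3]],
                [-a4[0], -a4[1], lam[2] - a4[2], lam[3] - a4[3]]])

A_cl = A_p + B_p @ K_p
print(A_cl)
```

The closed-loop matrix is block diagonal with two 2 x 2 companion blocks, so the two channels are decoupled in the canonical coordinates regardless of the open-loop cross terms.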

Due to the time constraints on this research, implementation and simulation results are not
available. Anticipated difficulties of the implementation are (i) the complexity of the control law
$K_p(t)$, (ii) training of the neural network inverse of the unstable nonlinear airframe, and (iii)
real-time implementation of the eigenstructure assignment logic. However, none of these problems
is insurmountable with further study.

8.  Summary  and  Conclusions 

In this research a systematic design procedure for nonlinear tracking and output/state
decoupling control has been developed by way of linearization along a nominal trajectory. The
resulting LTV tracking error dynamics are then stabilized and decoupled using PD-eigenstructure
assignment in a way similar to the eigenstructure assignment design for LTI systems. Theoretical
results on the assignability of the PD-eigenstructure for stabilization and decoupling have been
obtained. However, due to the time constraint, implementation and simulation results are not
available at present. Further research is needed to address the complexity of the
implementation, and to validate the theory and design procedure by simulations.

It is believed that this research is the first attempt at stabilization and decoupling of
time-varying dynamics using a differential-algebraic approach, without resorting to the unreliable
frozen-state, frozen-time gain scheduling. Preliminary results have shown excellent performance
and robustness with a simple pitch autopilot. Thus, further study on the implementation of the
significantly more complex roll-yaw autopilot is warranted. Furthermore, the PD-eigenstructure
assignment control is by no means limited to stabilization and decoupling. An array of challenges
posed by modern missile technology [2] can be addressed by the multiobjective PD-eigenstructure
assignment concept illustrated in Figure 8.1. Additional applications can be found in aircraft
flight control, spacecraft attitude control, vibration control, robotics and other nonlinear tracking
and fast time-varying control systems.

Figure 8.1 Multiobjective PD-Eigenstructure (PDES) Assignment Control


Appendix 


State  Equations  for  the  EMRAAT  Missile  [17] 

a  =  g  -  tan(/3)[p(cos(Q)  -  rsin(a)]  +  — (cos(a)cos(i?!.)cos(6>)  +  sin(a)sin(6')) 

sQS 

Wcos(/3) 

is  =  psin(a)  -  rcos(a)  +  ^{Cy^lS  +  Cy^p  +  Cyr  +  Cy^S^  +  CyJr)cos{(S) 

+  ^cos{O)sin(4>)cos{0) 

P  [(  ^xyJ-xzIzz  ^xz^y2  “t"  ^xy^yz  Ixylxzlyy')j^  + 

(^^yy^yz^zz  lyz  ^xy^yz  ^xy^xz^yy^Q  “1“ 
i~^yy^yzlzz  +  Ixylxz^zz  +  lyz  +  ^xz^yz)^'^  + 

+  {—Ixylyzhz  +  Ixzlyyhz  “  “^^Izhz  —  Ixylyylyz  +  Ixxhylyz  “  hz^ly  +  Ixxlxzlyy)pq  + 

+  (Ixylzz  +  Ixzlyzhz  “  Ixylyyhz  —  Ixxlxyhz  +  ’^hyl^z  +  Ixzlyylyz  —  Ixxlxzlyz)pr  + 

■I"  i~^yy^zz  +  ^yz^zz  +  ^lylzz  ~  lyyiyz  ~  ^xz^yy)^'^  + 

+  QSd{Cl^{IyyIzz  ~  lyz)  ^  ^zipi^xy^yz  +  Ixzlyy))p  + 

+  QSd{Cm^{IxyIzz  +  Ixzlyz))q  + 

+  QSd{ClXlyyhz  —  I^z)  +  Cry.{lxylyz  +  Ixzlyy))^  + 

+  QSd{CjnXlxyIzz  +  Ixzlyz))di  + 

+  QSdiPmS^xyhz  +  IxzIyz))oi  + 

+  QSd{Clp{IyyIzz  —  I^z)  +  C^npilxylyz  +  IxzIyy))P  + 

+  QSd{Clf^{IyyIzz  —  lyz)  C„^{lz:ylyz  +  Ixzlyy))dp  + 

+  QSd{Cms^{IxyIzz  +  Ixzlyz))^q  + 

+  QSd{Cl^{IyyIzz  —  lyz)  4"  Cni^{lxylyz  +  Ixzlyy))5r\ 

+  {Ixxlyyhz  ~  I^ylzz  ~  ^xx^lz  ~  ^^xylxzlyz  ~  ^xz^yy)~^ 
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Q  —  [(  Ixxlxzizz  Iix^xylyz  ^xy  ^xy^xz)P  "t” 

+  ijxylyzlzz  "I"  lyz^xz  Ixx^xy^yz  ^xy^xz^Q 
"i"  (  Ixy^yzlzz  ■)"  Ixi^xz^zz  ^xz^yz  ^xz)^ 

+  (^~lxx^yzlzz  "I"  Ixy^xx^zz  +  “^^xz^yz  ~  ^xx^yylyz  ~  Ixylxzlyy  +  lyz^x 
+  {Ixxl'^z  ~~  ^xz^zz  ~  Ixy^zz  ~  ^xx^zz  +  Ixxlyz  +  ^xz^xx)P>'  + 


+  (  ^xy^zz  ^xz^yz^zz  "I"  ^xy^yy^zz  +  ^xxlxylzz  “i"  ^xz^yy^yz  ^xz^xx- 

+  QSd(ClJJ[xyIzz  Ixzlyz^  "I"  CnJi,Ixx^yz  ^xyJ^xz))P  "t* 


+  QSd{Cm^{IxxIzz  ~  ^xz))‘l  + 

+  QSd{Cl^{IxyIzz  +  Ixz^yz)  +  Cn^{lxxlyz  +  Ixy^xz))'!'  + 
+  QSd{CmSlxxIzz-llz))^  + 

+  QSd{CmS^xxhz  —  llz))^  + 

+  QSd{Clp{IxyIzz  ^xzlyz)  CnpiJxxJyz  "I"  Ixz^xy)) P  "I" 
+  QSd(Cl^^{IxyIzz  “)■  ^xz^yz)  Cn^{lxxlyz  "t"  Ixz^xy))^p 
4-  QSd{Cms^(IxxIzz  —  Ixz))^q  + 

+  QSd{Cl^{IxyIzz  +  Ixzlyz')  +  Cji^iJxxIyz  + 


{IxxJyy^zz 


T"^  T  —  J  -OT  T  J 
■‘■xy^zz  ZxxZyz  ^ZxyZxzZyz 


/2  T  'v-l 

^xz^yy) 


r  =  \{~IxxIxzJyz  ^xxlxy^yy  ~  Ixy  ~  ^xz^xy)P  "i" 
+  {Jxz^yy^yz  +  ^yz^xy  ~  Ixx^xylyy  +  ^xy)‘l 

+  {  —  Ixylyz  ~  Ixz^yylyz  +  ^xx^xz^yz  +  Ixylxz)^ 


+  {—^xxdyz  ~  ^ly^xx  +  llzlyy  +  TtjJw  + 


7-2 

^xyZyy 


2 

yy^xx 


iLixx)pq  + 


^xy-^ 


+  {Jxx^yxlzz  "t"  Ixy^xz^zz  +  ^xxlyy^yz  “^^xy^yz  ^yz^xx  ^zylxzlyy 


+  (  Ixylyz  I zz 


Ixzlyylzz 


+  Ixulmljiz  +  Ixxlxylyz  +  Ixzlyy  Ixxlxzly] 


ZxxZxzZyy 


+  QS d(Cl^[IxyIyz  Ixzlyy)  +  Cn^ijxxlyy  Ixy))P 
+  QSd{Cm,{IxxIyz  +  Ixylxz))q  + 

+  QSd{Cl^{IxyIyz  +  Ixzlyy)  +  Cnr{lxxlyy  —  Ixy))’’’  + 
+  QSd{Cmi,{IxxIyz  —  IxyIxz))o  + 

+  QSd{Cm^{IxxIyz  —  Ixylxz))<^  + 

+  QS d{Cl^{IxyIyz  +  Ixzlyy)  +  Cnp^lixlyy  ~  Ixy))P  "I" 


+  QSd(Cl^{IxyIyz  +  Ixzlyy)  +  Cn^{lxxlyy  Ixy))^P  "t" 
+  QSd{Cmi^{IxxIyz  —  Ixylxz))dq  + 


+  QSd[Clz^{IxyIyz  +  Ixzlyy)  +  Cni^{lxxlyy  •^iy))^r] 


+  {Ixxlyylzz  ^xy^xz  Ixxl^ 


yz 


'  llxylxzlyz 


—  I  '1“^ 

^xz^yy) 


c  IxxIxylxz^Pq 

yz  ~  ‘IIxz^xy')'P' 


IxxIxyIxz')P^  4" 

+  1llylxz)qr  + 
Glossary of Terms

α - Angle of attack
β - Angle of sideslip
p - Roll rate
q - Pitch rate
r - Yaw rate
Q - Dynamic pressure
S - Reference area
d - Reference length/diameter
V - Missile velocity
W - Missile weight
g - Acceleration due to gravity
ψ - Yaw angle
θ - Pitch angle
φ - Roll angle
δp - Roll control input (surface deflection)
δq - Pitch control input (surface deflection)
δr - Yaw control input (surface deflection)
Q̄ - gQS/(WV)
N - Normal force
Y - Side force
C_ab - Aerodynamic coefficient: a due to b
l - Aerodynamic moment about x-axis
m - Aerodynamic moment about y-axis
n - Aerodynamic moment about z-axis
I_ij - Moment or product of inertia


Aerodynamic  coefficients  (Mach  =  2.0) 

CNa  =  36.6 
Cn^  =  0.0274 
Cm,  =  0.0145 
Cns,  =  6.0165 
C_Yβ = -14.9
C_Yp = -0.00073
C_Yr = 0.0161
C_Yδp = -0.01
C_Yδr = 0.08
C_lβ = 5.44
C_lp = -0.011
C_lr = 0.0021
C_lδp = -6.30
C_lδr = -5.16

Cm,  =  -82 

Cmt,  =  -0.014 
Cm,  =  -0.202 
Cm,,  =  -40.7 
C_nβ = 35.52
C_np = 0.006
C_nr = -0.2
C_nδp = 1.72
C_nδr = -28.65


EMRAAT Physical Properties

g = 32.2 ft/s^2
d = 0.625 ft
S = 0.3067 ft^2
W = 227 lb
V = 1936.16 ft/s
M = 2.0
Q = 1100.75 lb/ft^2
Ixx = 1.08 slug*ft^2
Iyy = 70.13 slug*ft^2
Izz = 70.66 slug*ft^2
Ixy = 0.274 slug*ft^2
Ixz = -0.704 slug*ft^2
Iyz = 0.017 slug*ft^2
Air density = 5.87 x 10^-4 slug/ft^3
Altitude = 30,000 ft
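
The tabulated properties can be cross-checked against one another with a few lines of arithmetic. The sketch below recomputes the dynamic pressure from the air density and speed, and forms the lumped factor gQS/(WV) from the glossary; the variable names are illustrative, and the area unit is assumed to be square feet because it matches the disk area of the listed diameter.

```python
# Values copied from the EMRAAT property list above (English units).
g = 32.2            # ft/s^2
S = 0.3067          # reference area, assumed to be in ft^2
W = 227.0           # lb
V = 1936.16         # ft/s
Q = 1100.75         # lb/ft^2, tabulated dynamic pressure
rho_air = 5.87e-4   # slug/ft^3

# Dynamic pressure recomputed from density and speed: Q = 0.5 * rho * V^2.
Q_check = 0.5 * rho_air * V**2

# Lumped side-force scale factor from the glossary: Q_bar = g Q S / (W V).
Q_bar = g * Q * S / (W * V)

# Gravity term that multiplies the attitude trigonometry in the sideslip equation.
g_over_V = g / V

print(f"Q recomputed  = {Q_check:.1f} lb/ft^2")
print(f"Q_bar         = {Q_bar:.6f}")
print(f"Q_bar * 0.08  = {Q_bar * 0.08:.3e}")
print(f"g / V         = {g_over_V:.3e}")
```

The last two products can be compared with the leading digits 1.979 and 16.631 that appear in the Section 7 input matrix and coefficient listing, which suggests those entries are the side-force coefficient listed as .08 and the gravity term, each scaled as shown.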
DESIGN AND IMPLEMENTATION OF A GLOBAL NAVIGATION SATELLITE SYSTEM (GNSS)

SOFTWARE RADIO RECEIVER

Ph.D. Candidate

Abstract

A prototype Global Navigation Satellite System (GNSS) software radio has been successfully developed. A
software radio has many advantages over the architecture of a traditional receiver. These include a tighter
integration between simulation and implementation, a tremendous level of versatility in the final design, and the
ability for a single receiver to function as multiple receivers. The focus of this implementation, a GNSS receiver, is
a navigation receiver and will bring all the benefits of the software radio to the navigation community. The
preliminary work accomplished in the development of the GNSS software radio thus far is the implementation of the
receiver front end, data collection hardware, and signal processing algorithms. This work has resulted in a
postprocessed position solution within 500 meters solely through the use of software-based signal processing.
DEVELOPMENT OF A GLOBAL NAVIGATION SATELLITE SYSTEM

SOFTWARE  RADIO 


Dennis  M.  Akos 


Introduction 

The software radio describes a receiver in which the majority of the signal processing is accomplished via a
programmable  microprocessor  as  opposed  to  analog  or  hardwired  discrete  components.  This  allows  for  a  tighter 
integration  of  simulation  and  implementation  as  well  as  tremendous  flexibility  in  the  final  design. 

The  software  radio  concept  is  being  applied  in  the  design  of  a  GNSS  receiver.  However,  this  concept  is  not 
limited to the GNSS signal and could be expanded to include other navigation signals in the same radio design.
This  initial  work  will  bring  the  benefits  of  such  an  implementation  to  the  navigation  community. 

The paper begins by describing the ideal software radio and details its multiple benefits. The target
implementation and the development testbed are characterized along with the necessary design steps. Finally, an
informal  discussion  of  various  GNSS  acquisition  and  tracking  methodologies  implemented  is  presented.  Described 
here  are  those  techniques  implemented  for  use  with  the  software  radio  and  validated  using  actual  GNSS  data. 

Software  Radio 

There  are  two  primary  design  goals  in  developing  a  software  radio.  First,  the  analog-to-digital  converter 
(ADC)  should  be  positioned  as  close  to  the  antenna  as  possible  in  the  front  end  of  the  receiver.  Second,  the  resulting 
samples  should  be  processed  using  a  programmable  microprocessor.  These  two  principles  provide  all  the  benefits 
associated  with  the  software  radio. 

Moving  the  ADC  closer  to  the  antenna  in  the  RF  front  end  chain  eliminates  additional  components  used  in 
frequency  translation.  These  components  include:  local  oscillators  (LO),  mixers,  and  filters,  all  of  which  can 
contribute potential nonlinear effects as well as temperature and age based performance variations. Ideally, the
receiver front end would consist of the antenna, amplifier, bandpass filter, and ADC. Frequency translation, since it
is impractical to process the signal at RF, is accomplished via bandpass sampling [1].

Bandpass  sampling  is  the  process  of  sampling  an  information  signal  based  on  its  bandwidth  as  opposed  to 
its  RF  carrier.  This  has  been  proposed  and  implemented  successfully  with  the  GPS-SPS  transmission  [2].  Reference 
2  details  a  front  end  design  consisting  of  an  antenna,  amplifiers,  filters  and  an  ADC  that  sampled  the  1575.42  MHz 
RF  carrier  directly  at  a  rate  of  5  MHz  and  achieved  the  desired  frequency  translation  via  bandpass  sampling. 
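
The frequency translation in this example can be checked with a short calculation. The sketch below folds the carrier into the first Nyquist zone of the sampler; the function name is illustrative, and the check that the signal bandwidth fits inside a single Nyquist zone is deliberately left out.

```python
def bandpass_alias(f_carrier_hz, f_sample_hz):
    """Frequency in [0, fs/2] at which a carrier appears after sampling at fs."""
    folded = f_carrier_hz % f_sample_hz
    # Components in the upper half of the band fold back (with spectral inversion).
    return min(folded, f_sample_hz - folded)

f_if = bandpass_alias(1575.42e6, 5.0e6)
print(f"aliased carrier: {f_if / 1e6:.2f} MHz")
```

With a 5 MHz sampling rate the 1575.42 MHz carrier lands at 0.42 MHz, so no mixer or local oscillator is needed to bring the signal down to a frequency the processor can handle.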

Processing  the  resulting  ADC  samples  strictly  in  software  provides  additional  benefits  for  the  software  radio 
concept.  First,  since  all  signal  processing  is  accomplished  in  software  there  is  tighter  integration  between  simulation 
and  actual  receiver  operation.  If  the  signal  degradations  can  be  adequately  simulated,  performance  can  be  accurately 
predicted  using  the  actual  discrete  signal  processing  that  will  occur  in  the  receiver.  This  is  especially  true  now  that 
the  front  end  contains  fewer  possible  error  sources.  Second,  there  is  a  tremendous  level  of  flexibility  in  the  receiver 
design since all signal processing is software based. In order to incorporate the latest theoretical developments,
costly hardware prototypes no longer need to be fabricated; rather, the developments can be incorporated into the
programming and evaluated. Various receiver architectures can be assessed simply by downloading the appropriate
software to the target processor.

The  front  end  and  the  flexibility  of  achieving  all  signal  processing  in  software  allow  the  design  to  serve  as 
multiple radios [3]. Currently, receivers exist in which the LO is adjusted to downconvert and process different
frequency  transmissions.  These  designs,  however,  are  limited  in  their  signal  processing  as  a  result  of  their  hardwired 
architectures.  With  the  software  radio  design  a  continuous  range  of  frequencies  could  be  captured  by  using  a  high 
sampling  rate.  Specific  transmissions  could  then  be  digitally  filtered  out  and  processed.  By  changing  the  software 
processing,  a  single  configuration  could  serve  as  an  FM,  AM,  or  PM  receiver.  This  concept  would  be  extremely 
beneficial  in  the  navigation  community  as  a  single  receiver  could  process  and  integrate  multiple  navigation  signals 
for  improved  accuracy,  reliability,  and  integrity. 

Some  of  these  ideas  are  reflected  in  the  current  software  radio  research.  Reference  4  discusses  a  GPS 
L1/L2  front  end  design  which  bandpass  samples  the  frequency  band  1.2  -  1.6  GHz  to  utilize  both  GPS  frequency 
transmissions.  This  front  end  is  followed  by  digital  filters  used  to  extract  the  exact  information  bands  of  interest  for 
further  processing. 

The  software  radio  concept  is  not  without  its  disadvantages,  unfortunately.  There  are  two  primary 
technological  factors  which  limit  its  current  practicality.  They  are  the  current  state-of-the-art  in  ADC  technology  and 
programmable  processing  power.  For  the  bulk  of  the  navigation  community,  the  highest  RF  signals  of  interest  are 
those in the GNSS band, which lie below 1.7 GHz. ADCs do exist which can provide multi-bit sampling at rates up to 4
Gsps [5]. At this sampling rate all frequency information from DC to 2 GHz can be captured. However, the available
programmable processing power significantly reduces the maximum possible data rate from that allowed by the ADC
[6]. This leads to the capture of partial frequency bands, as in the case of bandpass sampling. The limits imposed by
the current generation of programmable processors are so restrictive that a variation of bandpass sampling has been
proposed  for  the  combined  digitization  and  processing  of  GPS-SPS  and  GLONASS  signals  [7]. 

These  disadvantages  are  only  temporary.  ADC  performance  already  exceeds  what  is  required  for  the 
navigation  community.  The  lag  in  available  processor  power  should  be  eliminated  in  the  near  future.  Moore’s  law, 
which  has  held  true  since  the  inception  of  the  microprocessor,  has  shown  processing  power  to  double  every  18 
months. 

GNSS  SOFTWARE  RADIO  DEVELOPMENT 

A  navigation  software  radio  will  provide  significant  advantages  over  existing  receivers  and  their  traditional 
design.  Although  the  current  technology  will  not  allow  the  development  of  an  all-encompassing  navigation  software 
radio,  the  initial  goal  is  the  development  of  a  GNSS  software  radio.  As  technology  advances,  the  framework  (front 
end hardware and software algorithms) will be in place to take advantage of increased processing power.

The development is planned in three stages. First, a front end utilizing multiband bandpass sampling was
designed and implemented [7]. This research demonstrated proof of concept, and a final design is under
development. The second stage, currently underway, is the programming of the software algorithms necessary for
processing the sampled data. It is impractical to develop the algorithms to operate in real time from the outset.
Rather, a data set will be collected and postprocessed using the developing algorithms. The third stage will be the
optimization of these algorithms to operate in real time. The target processor for the real-time implementation is the
Texas Instruments TMS320C80 DSP, one of the most powerful DSPs available, capable of 2 billion operations per second (BOPS).

In  order  to  validate  the  spread  spectrum  acquisition  and  tracking  algorithms  it  is  necessary  to  obtain  an 
adequate length data set. The difficulty lies in the pre-correlation bandwidth of the GNSS signal. GPS-SPS has a
null-to-null bandwidth of approximately 2 MHz; therefore, the minimum sampling frequency must be at least 4 MHz.
A sampling frequency of 5 MHz is used to adequately capture the required frequency information. Assuming 8-bit
samples, 30 seconds of data (a full navigation frame for GPS-SPS) requires 150 MB of storage space. In order to
minimize the storage requirement, data sets of 12 seconds (or two subframes) are collected for postprocessing. If
subframes #1, 2, and 3 can be collected, the resulting data will contain enough information to establish a position
solution, which is the primary purpose of a navigation receiver. This can be accomplished by collecting a single data
set  corresponding  to  the  desired  subframe  and  then  storing  that  on  a  hard  drive  or  an  alternative  long  term  storage 
device,  then  collecting  the  next  desired  subframe  soon  after.  This  process  is  repeated  until  all  required  subframes 
can  be  collected. 
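
The storage figures quoted above follow directly from the sampling rate and sample width. The short sketch below reproduces that arithmetic; the function name `storage_bytes` is illustrative only and is not taken from the original software.

```python
def storage_bytes(sample_rate_hz, bits_per_sample, duration_s):
    """Bytes needed to store a continuous record of raw samples."""
    return sample_rate_hz * duration_s * bits_per_sample // 8

FS_HZ = 5_000_000                                  # sampling frequency stated in the text

full_frame = storage_bytes(FS_HZ, 8, 30)           # 30 s navigation frame, 8-bit samples
two_subframes = storage_bytes(FS_HZ, 8, 12)        # 12 s record (two 6 s subframes)

print(full_frame / 1e6, "MB for a full frame")
print(two_subframes / 1e6, "MB for two subframes")
```

Under these assumptions a 12 second record needs 60 MB, which is why the shorter records are easier to hold in host memory than a full 30 second frame.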

The  data  collection  platform  uses  a  more  traditional  front  end  design  to  reduce  the  requirement  on  the  ADC. 
The  configuration,  depicted  in  Figure  1,  employs  a  single  downconversion  stage  to  21.25  MHz,  where  the  signal  is 
bandpass  sampled  at  5  MHz,  resulting  in  a  final  IF  of  1.25  MHz.  This  arrangement  is  utilized  to  collect  GPS-SPS 
data sets and to develop generic GNSS signal processing algorithms. The data collection hardware is a
Peripheral Component Interconnect (PCI) card for use with Intel-based microcomputers. The card utilizes two ADCs,
which allow 12-bit sampling at rates up to 60 MHz, more than adequate for the 2 MHz null-to-null bandwidth of the
GPS-SPS signal. One distinct advantage of this card is the ability to use the PCI bus to write samples directly to the
memory of the host PC as opposed to using expensive memory on the ADC card itself. In this configuration it is
possible to store a continuous data record of up to 128 MB, the maximum memory size in typical motherboards.
Multiple data records of this length allow for capture of sufficient GNSS navigation data to compute a position
solution.
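
The final IF of 1.25 MHz is a consequence of aliasing in bandpass sampling. As a minimal sketch (the helper `aliased_frequency` is a hypothetical name, not part of the described system), the folded frequency of a real tone can be computed as follows:

```python
def aliased_frequency(f_in_hz, fs_hz):
    """Return (apparent frequency, spectrum_inverted) for a real tone
    at f_in_hz after sampling at fs_hz."""
    f = f_in_hz % fs_hz                 # fold into [0, fs)
    if f <= fs_hz // 2:
        return f, False                 # lands in an upright Nyquist zone
    return fs_hz - f, True              # lands in a mirrored zone: spectrum is inverted

# Values from the text: 21.25 MHz IF sampled at 5 MHz
final_if_hz, inverted = aliased_frequency(21_250_000, 5_000_000)
print(final_if_hz, inverted)
```

With the values given in the text the tone folds to 1.25 MHz without spectral inversion, matching the final IF quoted above.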


GNSS  SOFTWARE  RADIO  ACQUISITION 


The  first  stage  in  processing  the  code  division  multiple  access  (CDMA)  format  of  the  GPS-SPS  navigation 
signal  is  acquisition.  The  spread  spectrum  modulation  format  essentially  conceals  any  discernible  signal  in  the  raw 
data  set  when  viewed  in  either  the  time  or  frequency  domains.  This  is  also  true  for  GLONASS  even  though  it 
employs frequency division multiple access (FDMA). Each frequency channel uses the same maximal length code as
a  spreading  sequence  for  the  purpose  of  time  transfer.  Raw  data  collected  using  the  front  end  in  Figure  1  is  shown  in 
Figure  2  in  both  the  time  and  frequency  domain.  The  rolloff  in  the  frequency  domain  plots  is  a  result  of  the  2.0  MHz 
3  dB  bandwidth  of  the  final  filter  in  the  RF  chain.  The  raw  GPS-SPS  data  contains  the  CDMA  broadcasts  of  5 
visible  satellites.  This  same  front  end  configuration  allowed  enough  bandwidth  for  data  capture  of  2  of  the 
GLONASS  channels.  GLONASS  data  was  collected  by  adjusting  the  LO  of  the  front  end  to  translate  channels  21  & 
22  to  the  resulting  sampled  bandwidth. 


Acquisition is the search for the parameters necessary to identify the signal and begin tracking. In the case
of GPS-SPS, this includes the signal's spreading (Coarse/Acquisition (C/A)) code, carrier frequency, and code
phase. GLONASS reduces the search space by one parameter as it uses the same spreading sequence on each
frequency  channel.  The  search  can  be  visualized  as  a  matrix  (2-D  for  GLONASS  and  3-D  for  GPS)  where  every 
entry  must  be  tested  until  one  is  found  corresponding  to  the  correct  set  of  parameters.  This  search  space  must  be 
bounded  with  a  defined  step  size.  For  GPS-SPS  there  are  32  possible  C/A  codes.  The  possible  carrier  frequency, 
which  differs  as  a  result  of  Doppler,  is  bounded  for  most  users  to  ±10  kHz  from  nominal  and  is  searched  in  500  Hz 
bins. Lastly, the spreading code is 1023 chips for GPS-SPS and 511 chips for GLONASS and is searched in ½ and ¼
chip  increments,  respectively,  over  a  single  code  period. 
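
The size of the search space implied by these bounds can be tallied with a few lines of arithmetic. The sketch below assumes half-chip steps for GPS-SPS, quarter-chip steps for GLONASS, and that both end points of the ±10 kHz range are counted as bins; the function name and the end-point convention are assumptions, so the totals should be read as rough estimates only.

```python
def search_space_size(doppler_hz, bin_hz, code_chips, steps_per_chip):
    """Count frequency bins, code phases, and their product for one code/channel."""
    freq_bins = 2 * doppler_hz // bin_hz + 1      # assumption: both end points included
    code_phases = code_chips * steps_per_chip
    return freq_bins, code_phases, freq_bins * code_phases

gps = search_space_size(10_000, 500, 1023, 2)     # one C/A code, half-chip steps
glonass = search_space_size(10_000, 500, 511, 4)  # one frequency channel, quarter-chip steps

print("GPS-SPS:", gps)
print("GLONASS:", glonass)
```

For GPS-SPS this count applies to each of the 32 possible C/A codes, which is the third dimension of the search matrix.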

There are a number of popular spread spectrum signal acquisition algorithms [8]. However, most
commercial GPS-SPS receivers tend to use the serial search technique. The popularity of this technique is most
likely due to the fact that the digital correlator/accumulator hardware can be used not only for tracking, but also
acquisition if serial search is employed. In order to demonstrate the flexibility of the software radio approach,
multiple acquisition algorithms have been coded.

The  postprocessing  approach  allowed  for  a  more  comprehensive  evaluation  of  each  of  the  acquisition 
techniques.  First,  the  code  phase  search  was  stepped  in  terms  of  samples.  Second,  the  search  conducted  was 
exhaustive,  that  is  every  point  in  the  search  space  was  evaluated.  In  a  traditional  receiver,  points  in  the  search  space 
are  sequentially  tested  until  a  threshold  is  crossed  indicating  a  potential  match  has  been  obtained  and  control  is 
transferred  to  attempt  tracking.  In  serial  search  there  are  well-defined  equations  to  calculate  the  threshold  that  also 
determines the associated probabilities (missed detection and false acquisition) [9].

In  the  standard  serial  search  routine,  the  signal  is  converted  to  baseband  using  a  frequency  entry  from  the 
test  matrix  and  multiplied  by  the  spreading  code  with  a  code  phase  entry  from  the  test  matrix.  The  resulting  data 
points  are  accumulated  over  a  single  code  period  and  that  measurement  is  used  to  determine  if  the  correct  entry  from 
the  matrix  has  been  found.  Although  this  is  a  well-established  technique,  the  disadvantage  is  that  all  test  points  in  the 
matrix must be evaluated serially, as implied by the name. From the earlier discussion, exhaustive testing of a single
C/A code or GLONASS frequency will require evaluating:
possible entries in the test matrix. This search space can often be reduced through knowledge of the satellite
almanac data and current user position and time estimates. This will indicate which satellites/frequencies should be
tested first in an attempt to reduce acquisition times, but will not improve an exhaustive search, as it does not
eliminate any of the search space; rather, it provides a good initial estimate. This technique has been implemented and
tested  successfully  in  software.  Results  on  the  collected  data  set  will  be  presented  following  the  discussion  of  all 
applied  acquisition  methods. 
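
The serial search described above can be illustrated on synthetic data. The sketch below is not the author's implementation: it uses toy parameters (1 MHz sampling, a 250 kHz IF, one chip per sample) and a random ±1 sequence as a stand-in for a real spreading code, all of which are assumptions chosen to keep the example small.

```python
import numpy as np

# Toy parameters (assumptions, not the receiver's actual values)
FS_HZ = 1_000_000      # sampling rate
IF_HZ = 250_000        # nominal intermediate frequency
N = 1000               # samples in one code period (1 ms)

rng = np.random.default_rng(0)
code = rng.choice([-1.0, 1.0], size=N)   # stand-in spreading code, one chip per sample

# Synthetic received signal: delayed code on a Doppler-shifted carrier, plus noise
true_phase, true_freq = 123, IF_HZ + 1500
n = np.arange(N)
x = np.roll(code, true_phase) * np.cos(2 * np.pi * true_freq * n / FS_HZ + 0.7)
x += rng.normal(0.0, 1.0, N)

def serial_search(x, code, fs_hz, freqs_hz):
    """Test every (frequency, code phase) entry one at a time; return the strongest."""
    n = np.arange(len(x))
    best_power, best_freq, best_phase = -1.0, None, None
    for f in freqs_hz:
        baseband = x * np.exp(-2j * np.pi * f * n / fs_hz)   # convert to baseband
        for phase in range(len(code)):
            replica = np.roll(code, phase)                   # local code at this phase
            power = abs(np.dot(baseband, replica)) ** 2      # accumulate over one period
            if power > best_power:
                best_power, best_freq, best_phase = power, f, phase
    return best_freq, best_phase

freqs = IF_HZ + np.arange(-10_000, 10_001, 500)   # ±10 kHz in 500 Hz bins
f_hat, phase_hat = serial_search(x, code, FS_HZ, freqs)
print(int(f_hat), phase_hat)
```

Because this version is exhaustive, it keeps the strongest entry rather than stopping at the first threshold crossing, which mirrors the postprocessing evaluation rather than a real-time receiver.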

Two  published  improvements  on  the  serial  search  technique  have  been  implemented  for  use  with  the 
software  radio.  The  first  parallelizes  the  frequency  search  space  [10].  In  this  case  the  raw  data  is  multiplied  by  the 
spreading code with a code phase from the test matrix, then the Fourier transform of the resulting data set is taken.
All possible frequency bins are checked for the resulting carrier modulated only with the navigation data, which, if
found, would indicate the proper code phase had been utilized, and the bin in which it was located would provide the
necessary frequency information. There is no longer a need to search the various frequency entries of the test matrix;
however, the computational cost of the Fourier transform is incurred in exchange for the reduced search space.
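
A minimal sketch of the parallel frequency search follows, again on synthetic data. The toy sampling rate, the random ±1 stand-in code, and the choice of a carrier that falls exactly on an FFT bin are all assumptions made for illustration.

```python
import numpy as np

# Toy parameters (assumptions, not the receiver's actual values)
FS_HZ = 1_000_000
IF_HZ = 250_000
N = 1000                                   # FFT bins are FS_HZ / N = 1 kHz wide

rng = np.random.default_rng(1)
code = rng.choice([-1.0, 1.0], size=N)     # stand-in spreading code
true_phase, true_freq = 123, IF_HZ + 2000  # carrier placed on an FFT bin
n = np.arange(N)
x = np.roll(code, true_phase) * np.cos(2 * np.pi * true_freq * n / FS_HZ + 0.7)
x += rng.normal(0.0, 1.0, N)

def parallel_frequency_search(x, code, fs_hz):
    """For each code phase, strip the code and inspect all frequency bins at once."""
    bin_freqs = np.fft.rfftfreq(len(x), d=1.0 / fs_hz)
    best_mag, best_freq, best_phase = -1.0, None, None
    for phase in range(len(code)):
        despread = x * np.roll(code, phase)        # removes the code only if phase is right
        spectrum = np.abs(np.fft.rfft(despread))   # a carrier line appears when despread
        k = int(np.argmax(spectrum))
        if spectrum[k] > best_mag:
            best_mag, best_freq, best_phase = spectrum[k], bin_freqs[k], phase
    return best_freq, best_phase

f_hat, phase_hat = parallel_frequency_search(x, code, FS_HZ)
print(f_hat, phase_hat)
```

Only the code phase is stepped in the loop; the frequency dimension is covered by a single transform per phase.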

The  second  technique  parallelizes  the  code  phase  search  also  through  the  use  of  the  Fourier  transform  [11]. 
The  raw  data  is  converted  to  in-phase  and  quadrature  baseband  components  using  a  frequency  entry  from  the  test 
matrix.  The  Fourier  transform  of  this  data  is  taken  and  multiplied  by  the  complex  conjugate  of  the  Fourier  transform 
of the spreading code. The inverse Fourier transform is then applied to return to the time domain. Since
multiplication in the frequency domain corresponds to circular convolution in the time domain, and conjugating one
spectrum turns that convolution into a correlation, the resulting data represents the
circular correlation at all possible code phases for that particular frequency. Although this requires the computation
of  the  complex  Fourier  and  inverse  Fourier  transforms,  the  search  space  is  reduced  to  only  the  possible  frequency 
bins.  Since  the  almanac  data  can  often  provide  optimal  frequency  starting  points,  this  technique  can  significantly 
decrease  acquisition  times. 
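
The parallel code phase search can be sketched in the same way. As before, the toy parameters and the random ±1 stand-in code are assumptions, and the function name is illustrative rather than taken from the original software.

```python
import numpy as np

# Toy parameters (assumptions, not the receiver's actual values)
FS_HZ = 1_000_000
IF_HZ = 250_000
N = 1000

rng = np.random.default_rng(2)
code = rng.choice([-1.0, 1.0], size=N)     # stand-in spreading code
true_phase, true_freq = 123, IF_HZ + 1500
n = np.arange(N)
x = np.roll(code, true_phase) * np.cos(2 * np.pi * true_freq * n / FS_HZ + 0.7)
x += rng.normal(0.0, 1.0, N)

def parallel_code_phase_search(x, code, fs_hz, freqs_hz):
    """For each frequency bin, evaluate every code phase with one FFT/IFFT pair."""
    n = np.arange(len(x))
    code_spectrum_conj = np.conj(np.fft.fft(code))
    best_power, best_freq, best_phase = -1.0, None, None
    for f in freqs_hz:
        baseband = x * np.exp(-2j * np.pi * f * n / fs_hz)     # I and Q components
        # Circular correlation of the baseband data with the code at all shifts
        corr = np.fft.ifft(np.fft.fft(baseband) * code_spectrum_conj)
        power = np.abs(corr) ** 2
        k = int(np.argmax(power))
        if power[k] > best_power:
            best_power, best_freq, best_phase = power[k], f, k
    return best_freq, best_phase

freqs = IF_HZ + np.arange(-10_000, 10_001, 500)   # ±10 kHz in 500 Hz bins
f_hat, phase_hat = parallel_code_phase_search(x, code, FS_HZ, freqs)
print(int(f_hat), phase_hat)
```

Here the loop runs over the 41 frequency bins only, compared with 41 × 1000 individual tests in the equivalent serial search on the same toy data.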

The digital correlator/accumulator, popular in the great majority of receivers, makes these acquisition
techniques impractical since there is no access to the data prior to accumulation. This limitation illustrates the
advantage of the software radio approach. Each of the algorithms was coded and tested with the raw data sets
displayed  in  Figure  2.  Although  each  of  the  algorithms  correctly  identified  the  acquisition  parameters  for  all  of  the 
satellites  in  both  data  sets,  the  parallelized  code  phase  search  technique  greatly  reduced  the  exhaustive  search  times. 

The ability to postprocess the raw data provides interesting plots that give deeper insight into the acquisition
process. Figure 3 shows the Fourier transform of the GPS-SPS data depicted in Figure 2 post multiplication with the
correct C/A code with the proper code phase. This removes the spreading code and the resulting carrier modulated
only with navigation data appears at the appropriate frequency. Figure 4 depicts the same results for data from both
GLONASS frequencies. It is important to note that the acquisition signal processing software implemented is
applicable to either GNSS. Figure 5 depicts the exhaustive acquisition search results for all entries of the test matrix
for a single C/A code from a visible satellite.

Figure 3. Post Acquisition FFT of Raw GPS-SPS Data for All Five Satellite PRN (C/A) Codes
[Surviving panel title: SV#6, Code Phase = 4683 samples, Frequency = 1246000 Hz]

Figure 4. Post Acquisition FFT of Raw GLONASS Data for Frequency Channels #21 & #22

GNSS SOFTWARE RADIO TRACKING AND DATA PROCESSING


After  the  acquisition  parameters  have  been  identified,  the  second  stage  in  processing  the  GNSS  signal  is 
tracking and data demodulation. As part of the 1996 AFOSR Summer Research Program, two distinct
approaches were developed and implemented successfully in cooperation with the AAMP group of Wright
Laboratories. The goal behind any GNSS tracking algorithm is to precisely align the incoming spreading code with a
locally generated version and also to properly decode the navigation data. The combination of these two tasks
allows the calculation of a position solution.

The first technique utilized the tracking loop architecture popular in traditional GNSS receivers.
This consists of two coupled tracking loops, a code tracking loop and a carrier tracking loop. The code tracking
loop follows the conventional early-late noncoherent delay lock loop structure [12]. This element seeks to generate a
local synchronized version of the incoming spreading code. If the rate of the incoming code changes as a result of
the line-of-sight Doppler frequency, the code tracking loop adjusts the locally generated code rate accordingly. This
allows the CDMA format of the GNSS signal to be despread and further processed. The carrier tracking loop can be
a  frequency  or  phase  lock  loop.  Its  purpose  is  to  provide  a  frequency/phase  reference  to  the  code  tracking  loop  and 
demodulate  the  navigation  data. 

Figure 5. Results of Acquisition Algorithm Applied to GPS-SPS Signal over Entire Search Space
[Surviving panel title: GPS-SPS Signal Acquisition: SV#28, Code Phase = 4380 samples, Frequency = 1.25 MHz; axes: Code Phase Shift (samples), Intermediate Frequency (MHz)]

The second technique was developed by the AAMP group at Wright Laboratories to track both the code and
carrier of a CDMA signal [13]. This technique, known as Block Adjustment of the Synchronizing Signal (BASS),
despreads the GNSS signal format and demodulates the navigation data. The code tracking portion of the BASS
technique differs from the traditional implementation since the locally generated code rate remains fixed at the
nominal code rate. An early and late version of this code is generated and mixed with the incoming signal over 10
code periods. The ratio of the powers in the early and late components provides the additional accuracy needed to
adequately track the incoming code. This ratio can also indicate that the incoming code has slid, as a result of
Doppler, more than ½ of a sample out of synchronization with the locally generated code. When this happens, rather
than attempt to modify the rate of the local code to match the incoming code as is done in the traditional tracking
loop, the data set is simply shifted by a single sample in the appropriate direction. Using this technique, the locally
generated code provides a ‘rough’ indication of code position (approximately 60 meters using a 5 MHz sampling rate)
and the ratio can provide a more precise indication used for the position estimate. Carrier tracking is accomplished
in a manner similar to a frequency lock loop. The slope of the resulting baseband signal is used to adjust the carrier
frequency. However, navigation bit transitions can result in 180 degree phase shifts. When a navigation bit phase
change is detected, it is recorded for later processing of the navigation data, and the phase change is compensated so
that the baseband slope can still be used to accurately predict carrier frequency. This slope is based on 10 data
points, each of which is determined from 5000 data points (period of the spreading code at 5 MHz).
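
The two figures quoted in this paragraph, roughly 60 meters per sample and 5000 samples per code period, can be checked with a short calculation. The constant names below are illustrative; the C/A code parameters (1023 chips at 1.023 Mchips/s) are the standard GPS-SPS values.

```python
SPEED_OF_LIGHT_M_S = 299_792_458    # defined value of c
FS_HZ = 5_000_000                   # sampling rate from the text
CA_CODE_CHIPS = 1023                # C/A code length
CA_CHIP_RATE_HZ = 1_023_000         # C/A chipping rate

# Distance a signal travels during one sample interval
range_per_sample_m = SPEED_OF_LIGHT_M_S / FS_HZ

# Samples spanning one full code period (1 ms)
samples_per_code_period = FS_HZ * CA_CODE_CHIPS // CA_CHIP_RATE_HZ

print(round(range_per_sample_m, 2), "m per sample")
print(samples_per_code_period, "samples per code period")
```

A single-sample shift of the data set therefore moves the code alignment by about 60 m of range, which is why the early/late power ratio is needed for the finer estimate.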

The data decoded using both of these techniques, in combination with the accurate local code estimates,
provide the necessary pseudoranges and navigation data parameters to compute a position solution. The GPS-SPS
signal specification describes the format of the broadcast navigation data as well as the algorithms to be used in
computation of a position solution [14]. These were implemented and applied to the collected data, resulting
in a solution within 500 meters of the true position.

SUMMARY


This  paper  has  presented  the  initial  phase  in  the  development  of  a  GNSS  software  radio.  The  advantages  of 
such  an  implementation  to  the  navigation  community  were  discussed  along  with  potential  development  obstacles.  To 
date,  a  software  radio  front  end  has  been  evaluated  and  various  signal  acquisition  and  tracking  algorithms,  including 
those which are not applicable in traditional GNSS receiver designs, have been implemented and tested successfully.
This has established a position estimate, based on software-only processing of the GPS-SPS signal, accurate to
within 500 meters.

The  next  step  in  the  development  is  the  real  time  implementation.  Thus  far  all  results  presented  have  been 
based on data which was collected and postprocessed. Although this is extremely effective as it allows the necessary
software  debugging,  the  ultimate  goal  is  a  real  time  implementation.  This  is  currently  under  investigation  using  the 
Texas  Instruments  TMS320C80  Digital  Signal  Processor. 

In  addition  to  the  real  time  implementation,  the  data  collection  hardware  and  postprocessing  abilities  allow 
a  methodical  observation  of  the  results  of  any  processing  step  of  each  algorithm.  This  will  allow  better  analysis  into 
the algorithms and should result in processing techniques superior to those used in traditional implementations.

Abstract


This  thesis  considers  the  behavior  of  energetic  and  inert  solids  subjected  to  simple  shear  loading.  Data 
from  a  torsional  split-Hopkinson  bar,  built  for  this  study,  was  reduced  to  determine  shear  stress  and  shear 
strain  characteristics  of  these  materials.  These  results  were  then  used  to  calibrate  a  constitutive  law  for 
stress,  including  the  effects  of  strain  and  strain  rate  hardening  and  thermal  softening.  A  one  dimensional 
finite  difference  study  of  shear  localization  was  performed,  modeling  the  effects  of  thermal  conductivity, 
viscoplastic  heating  and  Arrhenius  kinetics.  Results  revealed  shear  localization  and  reaction  initiation  in  the 
explosives  simulated.  Experimental  failure  of  the  inert  solids,  however,  occurred  at  shear  strains  significantly 
lower  than  those  predicted  by  theory.  This  has  been  attributed  to  the  presence  of  failure  mechanisms  other 
than  shear  localization,  which  were  not  included  in  the  theoretical  model.  It  is  concluded  that  the  tested 
energetic materials are not expected to shear localize or initiate under the conditions considered.

EXPERIMENTAL AND NUMERICAL STUDY OF SHEAR LOCALIZATION AS AN INITIATION
MECHANISM IN ENERGETIC SOLIDS


Richard  J.  Caspar 


1  Introduction 

This  report  will  address,  experimentally  and  theoretically,  the  behavior  of  various  metals,  solid  explosive 
simulants,  and  solid  explosives  subject  to  simple  shear  loading.  In  addition,  this  report  will  consider  reaction 
initiation  in  the  energetic  materials  as  a  result  of  a  mechanism  known  as  shear  localization  or  shear  banding. 
In  this  section,  a  description  and  review  of  the  pertinent  work  is  given.  Two  softening  mechanisms  which 
lead  to  shear  banding  will  be  discussed:  viscoplastic  thermal  softening  and  void  nucleation  and  growth. 
Pertinent  work  performed  in  the  study  of  shear  localization  in  explosives  is  also  discussed. 

1.1  Overview 

The  motivation  for  this  report  lies  in  the  development  of  insensitive  munitions,  which  are  resistant  to 
accidental  detonation.  Insensitive  munitions  are  desired  for  many  reasons.  First,  insensitivity  lessens  safety 
risks  in  the  storage  and  handling  of  these  devices.  In  addition,  it  is  desired  to  prevent  sympathetic  detonation, 
in  which  the  detonation  of  one  device  causes  others  to  detonate.  Another  motivation  for  this  report  comes 
in  the  field  of  deep  earth  penetrators.  These  devices  are  designed  to  travel  through  tens  of  feet  of  rock, 
concrete  and  earth;  hence,  a  significant  amount  of  deformation  is  inherent  within  the  penetrator.  It  is  thus 
desired  to  design  these  munitions  to  be  insensitive  to  this  deformation. 

In order to develop insensitive munitions, it is necessary to more fully understand the behavior of explosives.
As full scale tests on explosives are often costly and time consuming, it is desirable to develop
computer models and simple bench-top experiments which predict the deformation and initiation of these
materials. There are numerous finite-element packages, such as EPIC and ABAQUS, which have been designed
to predict material deformation. To date, limited data exist to develop constitutive models for
explosives  to  use  as  input  into  these  packages.  One  of  the  foci  of  this  study  is  thus  to  determine  the  material 
properties  of  various  explosive  simulants  and  explosives.  These  properties  are  determined  through  the  use 
of  an  experimental  apparatus  known  as  the  torsional  split  Hopkinson  bar  (TSHB),  which  was  built  by  this 
author.  Results  obtained  using  this  apparatus  reveal  shear  stress-shear  strain  properties  at  strain  rates  of  10^ 
to  10'*  for  tested  materials.  In  addition,  photographs  of  the  deformation  are  taken  with  an  ultra-high 
speed camera, capable of framing at a rate of 2 million frames per second, to observe failure mechanisms.

The TSHB has previously been used to determine material characteristics for metals, in which failure
often occurs due to a mechanism known as shear localization. Shear localization is also known to be one
of the initiation mechanisms in solid explosives [Field et al., 1982], and it is one of the least understood.
Many of the studies on initiation, however, have been performed under shock and impact
conditions, in which the stresses within the explosives are much greater than the yield stress, thus making
the effect of the strength of the materials insignificant [Frey (1981), Boyle et al. (1989), Chou et al. (1991)].
As  a  result,  little  is  known  about  the  sensitivity  of  explosives  under  lower  stress  deformations,  where  the 
strength  of  the  material  becomes  significant.  In  deep  earth  penetrators,  it  is  surmised  that  explosives  undergo 
significant  deformation  at  high  strain  rates  and  relatively  low  stresses  (on  the  order  of  the  yield  stress  of  the 
material)  and  low  pressures,  in  which  the  material  strength  is  thought  to  affect  the  deformation  of  explosives. 
One of the detonation mechanisms expected to dominate under such conditions is shear localization. An
additional  goal  of  the  experimental  tests  is  thus  an  attempt  at  observing  shear  localization  in  explosives 
deforming  in  simple  shear. 

Figure 1. Schematic of the shear localization process. (a) Undeformed grid lines,
(b) Homogeneous deformation, (c) Shear localization

Figure 1 describes the mechanism of shear localization. In Figure 1a, a portion of an undeformed material
is sketched with thin lines inscribed on its surface. When this material is sheared, the scribe lines begin
to slant at a uniform angle, as seen in Figure 1b. This form of deformation is known as homogeneous
deformation. Increased straining into the plastic range results in hardening of the material. In addition,
if there is a geometric discontinuity, void, scratch or some other material weakness, straining near that
discontinuity  will  occur  at  a  higher  strain  rate,  which  also  hardens  the  material.  This  increased  local 
deformation,  however,  also  causes  plastic  heating  of  the  material.  If  the  straining  occurs  at  high  strain  rates 
(typically  greater  than  10^  s  ^),  there  is  not  enough  time  for  the  generated  heat  to  be  conducted  away. 
The  local  increase  in  heat  results  in  thermal  softening  of  the  material.  If  this  process  dominates  over  the 
hardening  due  to  strain  and  strain  rate  effects,  the  material  strength  decreases.  As  a  result  of  this  local 
softening  of  the  material,  deformation  is  localized  into  a  thin  planar  region,  as  depicted  by  the  scribe  line 
deformation of Figure 1c. This final process is known as shear localization or shear banding.

In addition to the experimental tests performed for this report, a numerical model was developed
to predict the deformation of a material in simple shear. The governing equations of conservation of momentum
and energy are used in this model. In the discussion of shear localization in the following section, two
softening mechanisms will be discussed which are known to lead to shear banding: thermal softening and
microvoid nucleation and growth. In this report, a constitutive law for the shear stress is utilized, in which
the effects of strain and strain rate hardening, and thermal softening are included; microvoid nucleation is
neglected. Since it has been shown that the choice of a particular constitutive law does not significantly
affect the results [Wright (1987), and Batra and Kim (1991)], a simplified power law will be implemented.
Also  included  in  the  model  is  the  effect  of  thermal  conductivity  due  to  its  importance  in  achieving  accurate 
temperature  predictions  [Batra  and  Kim,  1991].  Finally,  exothermic  reaction  is  modeled  by  an  Arrhenius 
kinetic  law.  Despite  results  attesting  to  the  fact  that  localization  is  a  multidimensional  process  [Marchand 
and  Duffy  (1988)  and  Giovanola  (1988  a,b)],  a  one  dimensional  model  will  be  developed,  since  this  model 
is  sufficient  in  yielding  important  information  about  the  shear  band  temperature  profile  in  solid  explosives. 
The  pertinence  of  these  effects  is  discussed  in  the  following  sections. 

The  novelty  of  this  report  first  lies  in  the  testing  of  explosive  simulants  with  the  torsional  split-Hopkinson 
bar  and  the  determination  of  their  shear  stress  and  shear  strain  characteristics.  Also,  the  implementation  of 
Arrhenius  kinetics  in  the  study  of  simple  shear  deformation  of  explosives  is  new.  In  addition,  most  researchers 
studying  simple  shear  deformation  have  used  a  finite-element  formulation  for  solving  the  governing  equations. 
In  this  report,  the  equations  are  solved  by  a  finite-difference  method. 

1.2  General  Reviews 

This  section  presents  a  brief  review  of  works  on  the  high  strain  rate  behavior  of  materials  as  well  as 
the  initiation  mechanisms  in  solid  explosives.  A  compilation  of  works  on  shock  wave  and  high  strain  rate 
phenomena  in  materials  is  presented  by  Meyers  et  al.  (1992).  Meyers  (1994)  also  discusses  these  dynamic 
events  in  materials.  These  books  discuss  various  failure  mechanisms  occurring  at  high  strain  rate  in  materials, 
including  shear  localization.  In  addition,  works  studying  high  strain  rate  effects  in  explosives  are  discussed. 

Bowden  and  Yoffe  (1985)  performed  an  extensive  review  of  experimental  works  on  explosive  mechanics  in 
order  to  categorize  the  various  mechanisms  of  initiation  in  solid  explosives.  They  concluded  that  initiation 
could  occur  by  the  adiabatic  compression  of  small  entrapped  gas  bubbles;  the  formation  of  hot  spots  on 
confining  surfaces,  extraneous  grit  particles  and  intercrystalline  friction  of  explosive  particles;  and  the  viscous 
heating  of  rapidly  flowing  explosive  as  it  escapes  impacting  surfaces.  Many  authors,  however,  have  discounted 
gas  compression  as  the  controlling  mechanism  for  hot  spot  formation  [Frey,  1985  and  Kang  et  al.,  1992].

Field  et  al.  (1982)  performed  impact  tests  on  thin  layers  of  several  explosives,  reporting  photographic
evidence  of  initiation  due  to  many  of  the  previously  stated  mechanisms,  as  well  as  some
additional  mechanisms.  In  these  tests,  the  explosives  were  subject  to  stresses  significantly  larger  than  the
yield  stresses  of  the  materials.  They  concluded  that  the  following  mechanisms  contributed  to  ignition  of 
explosives:  adiabatic  shear  banding,  adiabatic  heating  of  gas  spaces,  viscous  flow,  frictional  rubbing,  hot 
spots  at  crack  tips  and  triboluminescence.  The  authors  found  that  viscous  heating  could  play  an  important 
role  in  liquids,  but  could  only  lead  to  significant  heating  in  solids  when  considered  in  conjunction  with  the 
other  mechanisms  listed.  They  also  determined  that  the  propagation  of  cracks  alone  would  not  lead  to 
ignition.  Instead,  they  proposed  that  fracture  of  an  explosive  crystal  would  produce  a  gaseous  void  which 
could  in  turn  lead  to  adiabatic  heating  and  hot  spot  generation.  In  conclusion,  the  authors  note  that  no  one 
mechanism  is  the  dominant  means  of  ignition,  and  that  small  changes  in  the  experimental  conditions  can 
lead  to  the  formation  of  hot  spots  due  to  different  mechanisms.  For  detailed  reference  on  detonation  theory 
with  discussion  of  experiments,  see  Fickett  and  Davis  (1979).

1.3  Adiabatic  Shear  Banding  in  Metals 

Shear  banding,  as  an  initiation  mechanism  in  explosives,  is  only  partially  understood.  It  is  known  that
metals  subject  to  high  strain  rate  loading  in  association  with  high  speed  machining,  cutting  and  forming, 
as  well  as  in  impact  and  penetration,  often  experience  highly  localized  plastic  deformation,  known  as  shear 
bands.  The  thickness  of  these  shear  bands  is  typically  on  the  order  of  micrometers,  and  they  have  been 
known  to  develop  in  times  on  the  order  of  microseconds  [Marchand  and  Duffy  (1988),  Giovanola  (1988  a,b)]. 
Due  to  the  significant  amount  of  localized  viscoplastic  work  on  such  a  short  time  scale,  highly  localized 
temperatures  of  about  1000°C  are  observed.  Although  formation  of  these  shear  bands  is  typically  followed 
by  fracture,  failure  in  any  accepted  sense  of  the  word  occurs  with  formation  of  the  shear  band  since  the 
material  has  lost  its  load  carrying  capacity  [Marchand  and  Duffy,  1988].  Hence,  a  significant  amount  of 
research  has  been  performed  on  shear  banding  as  a  failure  mechanism  in  structural  materials. 

1.3.1  Experimental  Observations 

It  is  generally  understood  that  adiabatic  shear  localization  begins  because  thermal  softening  dominates 
over  strain  and  strain  rate  hardening  in  the  deformation  process.  When  a  material  is  plastically  strained, 
dislocations  begin  to  slip  within  the  material,  accumulating  at  grain  boundaries.  As  these  dislocations 
coalesce,  there  is  less  room  for  dislocations  to  slip,  thus  resulting  in  strain  hardening  of  the  material
[Lubliner,  1990].  If  the  rate  of  straining  increases,  viscous  stress  will  further  resist  deformation,
which  contributes  to  a  process  known  as  strain  rate  hardening.  An  important  result  of  dislocation  slip  is  the
generation  of  heat.  Rogers  (1979)  has  concluded  that  approximately  90%  of  the  plastic  work  is  converted  into
heat,  while  the  remainder  is  stored  in  the  generation  and  arrangement  of  dislocations.  This  increase  in  heat
tends  to  free  the  motion  of  dislocations,  resulting  in  thermal  softening  of  the  material.  The  generation  of  heat
then  results  in  further  plastic  strain,  which  causes  further  increases  in  temperature.  At  high  enough  strain
rates,  there  is  not  enough  time  for  the  heat  generated  to  be  conducted  away;  the  deformation  thus  becomes 
adiabatic.  If  a  material  is  experiencing  high  strain  rate  adiabatic  deformation,  and  there  is  a  material  or 
geometric  weakness,  such  as  a  void  or  scratch,  straining  will  increase  locally,  causing  an  instability.  If  the 
properties  of  a  given  material  are  such  that  the  mechanism  of  thermal  softening  dominates  over  strain  and 
strain  rate  hardening,  deformation  will  localize  into  a  planar  region,  known  as  a  shear  band. 

Zener  and  Hollomon  (1944)  were  among  the  first  to  describe  the  process  of  shear  localization  in  detail. 
They  state  that  a  necessary  condition  for  shear  localization  is  the  existence  of  a  maximum  in  the  homogeneous,
adiabatic  stress  strain  curve,  beyond  which  deformation  cannot  be  homogeneous  and  strength  decreases  with
increasing  strain.  When  the  strain  at  this  maximum  is  surpassed,  an  instability  will  then  arise,  in  which 
a  region  deforms  at  a  greater  rate  than  the  surrounding  material,  causing  it  to  weaken  and  further  strain, 
while  the  surrounding  material  is  no  longer  strained. 
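
The Zener and Hollomon condition can be illustrated with a short numerical sketch. It assumes a generic power law for the shear stress with strain hardening, strain rate hardening and thermal softening, and integrates the adiabatic energy balance with the 90% heat fraction quoted from Rogers (1979). The functional form, the function names and every material constant are illustrative, steel-like assumptions and are not the model or the parameters of this report.

```python
def adiabatic_stress_strain(gamma_max=2.0, steps=20000, strain_rate=1.0e3):
    """Homogeneous, adiabatic simple-shear response for an assumed power law.

    All constants below are illustrative assumptions, not values from the report.
    """
    tau0 = 500.0e6           # reference flow stress, Pa (assumed)
    gamma0 = 0.01            # reference strain (assumed)
    n = 0.10                 # strain hardening exponent (assumed)
    m = 0.02                 # strain rate hardening exponent (assumed)
    rate0 = 1.0              # reference strain rate, 1/s (assumed)
    nu = -0.5                # thermal softening exponent (assumed)
    T0 = 300.0               # initial temperature, K (assumed)
    rho, c = 7800.0, 470.0   # density kg/m^3, specific heat J/(kg K) (assumed)
    beta = 0.9               # fraction of plastic work converted to heat

    dgamma = gamma_max / steps
    gamma, T = 0.0, T0
    curve = []
    for _ in range(steps + 1):
        tau = (tau0 * (1.0 + gamma / gamma0) ** n
               * (strain_rate / rate0) ** m * (T / T0) ** nu)
        curve.append((gamma, tau, T))
        # Adiabatic energy balance: rho * c * dT = beta * tau * dgamma
        T += beta * tau * dgamma / (rho * c)
        gamma += dgamma
    return curve


def strain_at_peak_stress(curve):
    """Return (strain, stress) at the maximum of the stress-strain curve."""
    gamma_peak, tau_peak, _ = max(curve, key=lambda row: row[1])
    return gamma_peak, tau_peak


curve = adiabatic_stress_strain()
gamma_peak, tau_peak = strain_at_peak_stress(curve)
print(f"peak stress {tau_peak / 1e6:.0f} MPa at nominal strain {gamma_peak:.2f}")
```

Beyond the strain at peak stress, thermal softening outweighs the combined hardening in this sketch, which is the regime in which a small defect could grow into a localized band.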

In  order  to  study  the  deformation  and  temperature  distribution  across  a  shear  band,  Marchand  and 
Duffy  (1988)  performed  tests  in  simple  shear  on  thin  tubular  specimens,  by  means  of  a  torsional
split-Hopkinson  bar.  Previous  applications  of  this  device  only  resulted  in  average  values  of  shear  stress  and  shear
strain.  Marchand  and  Duffy  were  among  the  first  researchers  to  perform  detailed  measurements  of  the  shear  band
formation.  They  used  high  speed  photographs  to  study  the  deformation  of  fine  lines  etched  on  the  specimen’s
surface  in  order  to  determine  the  local  shear  strain.  From  an  analysis  of  their  photographs,  they  concluded
that  shear  banding  occurs  in  three  distinct  stages.  In  the  first  stage,  the  material  is  undergoing  homogeneous
deformation,  in  which  the  grid  lines  are  inclined  at  a  constant  angle.  In  the  second  stage,  the  material 
undergoes  inhomogeneous  deformation,  in  which  the  etched  lines  are  curved.  As  deformation  continues 
in  this  stage,  there  is  a  continuous  increase  in  the  localized  strain,  while  the  width  of  the  inhomogeneity 
decreases.  It  is  important  to  note  that,  in  this  stage,  the  deformation  remains  uniform  in  the  circumferential 
direction.  The  decrease  in  the  stress,  however,  is  never  large  over  this  region.  The  third  stage  begins  at 
the  value  of  nominal  strain  where  the  stress  first  starts  to  drop  severely.  The  deformation  becomes  severely 
localized  in  a  thin  plane.  It  is  at  this  time  that  the  one  dimensional  assumption  of  localization  breaks  down. 
Marchand  and  Duffy  observed  that  the  axial  position  of  the  maximum  local  strain  is  not  the  same  for  all 
points  in  the  circumferential  direction  indicating  that  the  shear  band  originates  in  several  locations  or  that 
it  originates  in  one  location  and  propagates  around  the  circumference  of  the  specimen. 

The  maximum  shear  strain  reached  in  the  shear  band,  with  an  applied  shear  strain  rate  of  1600  s^-1,  is
1900%,  with  a  shear  band  width  found  to  be  20  μm.  These  large  strains  demonstrate  the  weakening  effect
of  shear  banding.  For  dynamic  loadings,  deformation  is  often  localized  to  a  small  region,  which  is  forced 
to  absorb  the  majority  of  the  deformation.  As  a  result,  the  strength  of  the  whole  specimen  is  not  utilized, 
causing  failure  at  much  lower  nominal  strains  than  in  quasi-static  experiments.

Marchand  and  Duffy  also  performed  temperature  measurements  across  the  shear  band  with  infra-red
radiation  detectors.  Results  for  an  applied  strain  rate  of  1400  s^-1  revealed  a  temperature  spike  at  the
location  of  maximum  shear,  with  a  maximum  recorded  temperature  of  590°C.  This  value  represents  an 
average  temperature  over  a  region  which  is  greater  than  the  width  of  the  shear  band;  from  a  calculation 
taking  into  account  the  width  of  the  shear  band,  temperatures  as  high  as  1000°C  are  surmised. 

In  a  similar  study,  Giovanola  (1988a)  performed  tests  on  VAR  4340  steel.  He  used  high  speed  photography
to  observe  the  deformation  and  infrared  detection  to  measure  the  temperature.  In
accord  with  Marchand  and  Duffy,  Giovanola  determined  a  maximum  shear  strain  at  failure  of  2000%.  It  is
also  pertinent  to  note  that  Giovanola  observed  inhomogeneous  deformation  prior  to  shear  banding,  as  was
observed  by  Marchand  and  Duffy  (1988).  In  a  companion  paper,  Giovanola  (1988b)  performed  fractographic
and  metallographic  observations  of  the  failure  surface  to  determine  that  shear  banding  is  a  result  of  thermoplastic
instability  and  microvoid  nucleation  and  growth.  In  addition,  Giovanola  notes  that  failure  occurred
on  a  number  of  parallel  planes  connected  by  well  defined  steps.  This  observation  supports  the  claim  that 
shear  bands  nucleate  at  a  number  of  locations  in  the  softened  region  and  propagate  around  the  specimen. 

1.3.2  Theoretical  Predictions 

As  a  result  of  the  significant  amount  of  experimental  data  related  to  dynamic  simple  shear,  determined 
from  the  torsional  Hopkinson  bar,  and  due  to  the  mathematical  ease  in  its  modeling,  much  of  the  analytical 
work  in  the  study  of  shear  localization  has  been  performed  in  connection  with  thermoviscoplastic  simple
shear  deformation.  In  order  to  theoretically  model  the  problem,  governing  equations  are  developed  to  model
the  relevant  physical  conservation  principles.  These  principles  do  not  form  a  complete  set  and  are  hence
supplemented  by  constitutive  equations,  through  which  specific  materials  are  modeled.  A  review  of  past 
works  in  the  field  of  adiabatic  shear  localization  is  presented  by  Rogers  (1979). 

In  the  development  of  a  constitutive  model  for  stress,  there  are  two  schools  of  thought  as  to  the  method 
by  which  the  strength  is  softened.  The  first  theory  is  that  the  stress  is  reduced  as  a  result  of  heat  being 
generated  from  viscoplastic  deformation.  This  process  results  in  thermal  softening,  which  dominates  over 
strain  and  strain  rate  hardening.  Alternatively,  many  researchers  have  studied  the  softening  of  the  stress 
as  a  result  of  microvoid  nucleation  and  growth.  This  theory  is  typically  formulated  to  state  that  at  some 
critical  strain,  voids  begin  to  nucleate  in  the  material,  thus  reducing  the  cross  sectional  area,  and  hence 
the  strength  of  the  material.  Microvoid  nucleation  and  growth  is  generally  presented  in  conjunction  with 
thermal  softening,  microvoid  nucleation  being  the  trigger  for  thermal  softening.  Meyer  (1992)  gives  a  review 
of  some  of  the  constitutive  relations  which  have  been  used  for  high  strain  rate  applications.  In  the  present 
report,  a  numerical  model  will  be  developed  which  takes  into  account  softening  due  to  viscoplastic  work 
alone;  microvoid  formation  will  be  neglected. 


In  existing  studies  of  thermal  softening  induced  shear  localization,  there  is  a  significant  amount  of  discrepancy
in  the  choice  of  the  constitutive  equation.  In  order  to  address  this  issue,  Wright  (1987)  compared
the  results  of  four  commonly  used  viscoplastic  constitutive  relations.  The  constitutive  laws  used  were  1) 
an  Arrhenius  stress  law,  2)  the  Bodner-Partom-Merzer  law,  3)  a  simple  power  law,  and  4)  the  Litonski 
law,  which  were  all  calibrated  over  the  same  data.  He  found  that  the  results  were  both  qualitatively  and 
quantitatively  similar,  with  the  results  within  5%  of  each  other  for  shear  strain  rates  up  to  10^  a  value 
far  in  excess  of  the  calibration  conditions.  For  even  larger  values  of  shear  strain  rate  and  high  temperatures, 
the  Bodner-Partom-Merzer  law  begins  to  diverge  from  the  other  solutions.  Wright  states  that  since  there  is 
a  significant  difference  between  the  strain  rate  and  temperature  at  the  center  of  the  shear  band  from  those 
of  the  calibration  conditions,  the  actual  structure  of  a  real  shear  band  would  be  expected  to  be  somewhat 
different  from  that  predicted  by  any  given  constitutive  law.  He  thus  thought  it  surprising  that  the  trends
predicted  by  the  constitutive  laws  were  as  similar  as  found. 

Another  source  of  discrepancy  among  previous  researchers  is  the  role  of  thermal  conductivity.  To  clarify
this  issue,  Batra  and  Kim  (1991)  compared  the  results  of  three  different  constitutive  relations,  while  varying 
the  thermal  conductivity.  In  this  study,  the  researchers  considered  a  thermoviscoplastic  block  undergoing 
one  dimensional  simple  shearing  deformations,  with  strain  and  strain  rate  hardening  and  thermal  softening. 
The  thickness  of  the  block  was  taken  to  vary  smoothly  with  a  5%  decrease  in  thickness  at  the  center.  The 
constitutive  laws  considered  were  the  Litonski  law,  the  Bodner-Partom  law,  and  the  Johnson-Cook  law.  Batra 
and  Kim  found  that  the  results  from  the  three  constitutive  laws  are  extremely  similar,  both  qualitatively 
and  quantitatively,  verifying  the  results  achieved  by  Wright  (1987).  As  a  result  of  these  studies,  we  have 
determined  that  the  use  of  a  simple  power  law  will  be  adequate  in  the  study  of  high  speed  deformation. 

Batra  and  Kim  also  reported  that  large  increases  in  the  value  of  the  thermal  conductivity  delay  the 
initiation  of  stress  collapse  and  slow  down  the  development  of  the  shear  band.  However,  for  realistic  values 
of  thermal  conductivity  there  is  little  effect  on  the  value  of  nominal  strain  at  which  stress  collapses,  and 
hence  localization  occurs.  In  contrast,  the  authors  found  that  the  rate  of  evolution  of  the  temperature  at 
the  center  of  the  specimen  decreases  with  increasing  thermal  conductivity,  resulting  in  significant  changes  in 
temperature  for  realistic  values  of  thermal  conductivity.  It  is  therefore  concluded  that  thermal  conductivity 
can  be  neglected  when  one  is  only  concerned  with  the  timing  of  stress  collapse,  but  it  proves  to  be  crucial
when  considering  shear  band  temperatures.  Since  this  current  report  is  studying  thermal  reaction  initiation 
in  explosives,  conductivity  will  play  an  important  role. 
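
A rough way to see why conduction matters for shear band temperatures is to compare the characteristic thermal diffusion length, sqrt(alpha t), with the band width over the microsecond time scales quoted earlier. The sketch below does this for assumed, steel-like properties; the property values and the function name are illustrative assumptions, and only the 20 μm band width is taken from the Marchand and Duffy result cited above.

```python
import math


def thermal_diffusion_length(k, rho, c, t):
    """Characteristic conduction distance sqrt(alpha * t), in metres."""
    alpha = k / (rho * c)   # thermal diffusivity, m^2/s
    return math.sqrt(alpha * t)


# Illustrative steel-like properties (assumed, not taken from the report).
k, rho, c = 50.0, 7800.0, 470.0   # W/(m K), kg/m^3, J/(kg K)
band_width = 20.0e-6              # m, band width reported by Marchand and Duffy

for t in (1.0e-6, 10.0e-6, 100.0e-6):
    length = thermal_diffusion_length(k, rho, c, t)
    print(f"t = {t * 1e6:6.1f} us: diffusion length / band width = "
          f"{length / band_width:.2f}")
```

Under these assumptions the diffusion length becomes comparable to the band width within tens of microseconds, so heat loss from the band is not negligible when the peak temperature is the quantity of interest.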

Motivated  by  this  study,  Batra  et  al.  (1995)  performed  a  thermoviscoplastic  analysis  neglecting  thermal 
conductivity  in  order  to  rank  twelve  materials  according  to  the  critical  strain  necessary  to  reach  localization 
in  tubular  specimens  loaded  in  simple  shear.  A  finite-element  method  with  the  Johnson-Cook  constitutive 
law  was  used  in  these  calculations.  The  thickness  of  the  tube  was  taken  to  vary  smoothly  with  a  10%
decrease  at  the  center.  Batra  et  al.  assumed  that  the  shear  band  initiates  when  there  is  a  catastrophic  drop
in  torque,  and  ranked  the  materials  according  to  the  corresponding  nominal  shear  strain.  They  found  shear 
bands  to  initiate  in  the  following  order:  tungsten,  S-7  tool  steel,  depleted  uranium,  2024-T351  aluminum, 
7039  aluminum,  4340  steel,  Armco  iron,  Carpenter  electric  iron,  1006  steel,  cartridge  brass,  nickel  200  and
OFHC  copper,  when  tested  with  a  shear  strain  rate  of  5000  s^-1.  It  is  relevant  to  note  that  the  critical  strain
was  dependent  on  the  size  of  the  initial  defect  as  well  as  the  finite  element  mesh  used.  The  relative  ranking 
of  the  materials,  however,  was  independent  of  these  parameters. 

In  order  to  study  the  development  of  a  shear  band,  Batra  and  Kim  (1992)  solved  a  nonlinear  system 
of  equations  for  a  thermoviscoplastic  block  with  the  Johnson-Cook  constitutive  law,  including  strain  and 
strain  rate  hardening,  thermal  softening  and  thermal  conductivity.  They  used  a  continuous  variation  in  the 
thickness  to  instigate  localization.  The  authors  found  the  same  three  stage  localization  process  as  Marchand 
and  Duffy  (1988)  with  transition  to  stage  two  occurring  at  the  time  the  stress  reached  its  maximum  value, 
and  stage  three  occurring  much  later,  when  the  stress  has  dropped  to  about  90-95%  of  its  maximum  value. 
Batra  and  Kim  also  observed  the  effects  of  varying  the  thickness  of  the  block.  They  found  that  the  defect 
size  has  a  stronger  influence  on  ductile  materials  than  on  less  ductile  materials,  but  in  all  cases,  it  has  a 
significant  effect  on  the  critical  strain  to  reach  localization. 

In  a  similar  study,  Clifton  et  al.  (1984)  used  a  simple  power  law  including  strain  and  strain  rate  hardening 
and  thermal  softening  but  neglected  thermal  conductivity  to  perform  a  study  on  the  critical  conditions  for 
shear  band  formation.  They  concluded  that  the  primary  factor  affecting  initiation  of  the  shear  band  is  the 
strain  hardening,  whereas  the  strain  rate  hardening  is  the  primary  factor  affecting  the  rate  of  growth  of 
the  shear  band.  Supplementing  these  observations,  Wright  and  Batra  (1985)  used  the  Litonski  flow  law  to 
explore  the  critical  strain  at  collapse.  In  accord  with  Clifton  et  al.  (1984),  Wright  and  Batra  determined 
that  strain  rate  plays  little  role  on  the  critical  strain  at  localization.  However,  they  found  that  the  size  of  an 
initial  temperature  perturbation  plays  a  significant  role,  with  the  larger  perturbation  causing  localization  to 
occur  at  a  smaller  critical  strain. 

1.4  Detonation  Mechanisms  in  Explosives 

As  was  stated  previously,  shear  localization  is  understood  to  be  one  of  the  mechanisms  which  can  lead
to  ignition  in  solid  explosives  subject  to  high  strain  rates  and  relatively  low  pressures.  Traditional  studies 
of  shear  localization  in  explosives  have  been  performed  in  conjunction  with  high  pressures,  representing  the 
conditions  undergone  in  shock  and  impact  loading.  Frey  (1981)  developed  a  model  to  describe  heating  in 
high  explosives  due  to  shear  banding.  He  used  a  linear  viscoplastic  constitutive  law,  neglecting  the  effects  of 
strain  rate  hardening.  The  strength  was  decreased  over  a  30°C  range  after  the  melting  point,  the  melting 
point  increased  linearly  with  pressure,  and  the  viscosity  was  dependent  on  both  temperature  and  pressure.
In  addition,  Frey  used  Arrhenius  kinetics  to  model  the  thermal  explosion.  This  material  was  then  deformed
in  shear,  stimulating  localization  by  setting  the  strength  to  zero  over  a  small  region  within  the  deformation. 
Without  reaction,  Frey’s  model  achieved  a  maximum  temperature  which  turned  out  to  be  independent  of 
strength  of  the  material  and  thickness  of  the  initial  weakness.  These  parameters  did,  however,  affect  the 
rate  of  growth  of  the  shear  band.  The  author  found  that  the  factors  which  did  affect  the  temperature  were 
the  pressure,  strain  rate,  and  viscosity.  Frey  found  that  high  pressures  result  in  higher  temperatures.  In
addition,  under  low  pressures,  it  would  be  very  difficult  to  reach  the  temperatures  in  a  shear  band  required 
to  instigate  thermal  explosion. 
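
The strong temperature sensitivity behind this conclusion can be seen from a single-step Arrhenius rate law. The sketch below assumes a first-order depletion form, d(lambda)/dt = Z (1 - lambda) exp(-E / (R T)), with order-of-magnitude values for the frequency factor Z and activation energy E; the form, the names and the values are illustrative assumptions, not the kinetics used by Frey or in this report.

```python
import math

R_GAS = 8.314  # universal gas constant, J/(mol K)


def arrhenius_rate(temperature, progress, Z=5.0e19, E=2.2e5):
    """Reaction rate d(lambda)/dt for a one-step, first-order Arrhenius law.

    temperature : K
    progress    : reaction progress lambda, 0 (unreacted) to 1 (fully reacted)
    Z, E        : frequency factor (1/s) and activation energy (J/mol), assumed
    """
    return Z * (1.0 - progress) * math.exp(-E / (R_GAS * temperature))


for T in (500.0, 750.0, 1000.0):
    print(f"T = {T:6.0f} K: initial rate = {arrhenius_rate(T, 0.0):.3e} 1/s")
```

With these assumed constants the rate rises by many orders of magnitude between 500 K and 1000 K, which is why a modest shortfall in shear band temperature can separate no reaction from thermal explosion.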

In  experiments  performed  on  explosives  subject  to  lower  pressures  and  shear  deformation,  such  as  in 
drop  weight  tests,  Boyle,  Frey  and  Blake  (1989)  confirm  the  numerical  observations  of  Frey  (1981).  They
find  that  pressure  and  velocity  do  indeed  have  a  strong  effect  on  the  initiation  of  solid  explosives.  From  a
comparison  of  explosive  materials,  they  also  verify  that  higher  viscosities  increase  the  sensitivity  to  explosion.
In  a  more  recent  study,  Chou  et  al.  (1991)  studied  two  theories  for  the  impact  initiation  of  explosives:  shock
initiation,  a  pressure  dependent  theory,  and  shear  initiation,  a  temperature  dependent  theory.  Chou  et  al. 
state  that  there  are  three  means  by  which  heat  can  be  generated  in  explosives:  shock  compression  energy, 
plastic  work  and  viscous  work.  They  state,  in  accord  with  previous  researchers,  that  plastic  work,  in  the 
absence  of  high  pressures,  would  not  generate  enough  heat  to  produce  thermal  explosion,  since  failure  occurs 
before  a  significant  amount  of  straining  occurs.  Chou  et  al.  further  state  that  once  a  material  has  reached  its 
melting  temperature,  the  effect  of  plastic  work  becomes  negligible;  heating  then  results  from  viscous  work, 
which  is  capable  of  increasing  the  temperature  well  above  the  melting  point.  It  is  known,  though,  that  brittle 
materials  become  more  ductile  under  pressure;  in  fact,  Chou  et  al.  state  that  pressure  can  raise  the  stress 
and  strain  to  as  much  as  ten  times  as  high  as  a  material’s  uniaxial  value.  This  effect  thus  increases  the 
importance  of  considering  heating  by  plastic  work,  when  considering  a  material  under  hydrostatic  pressure. 

In  order  to  compare  the  effect  of  shock  and  shear  initiation  in  impacted  explosives,  Chou  et  al.  developed 
a  numerical  model  similar  to  that  of  Frey  (1981)  and  simulated  projectile  impact  on  bare  and  covered
explosives.  They  concluded  that  for  bare  explosives  impacted  by  a  projectile,  shock  initiation  is
dominant  and  the  shear  effect  is  negligible.  For  covered  explosives  impacted  by  a  projectile,  viscoplastic 
heating  is  of  importance  and  shear  initiation  at  the  edge  of  the  plug  is  probable. 


2  Experimental  Method 

This  chapter  discusses  an  experimental  apparatus,  known  as  the  torsional  split  Hopkinson  bar  (TSHB), 
which  was  used  to  test  metals,  solid  explosive  simulants  and  solid  explosives  in  simple  shear.  An  analysis 
of  the  data  determined  from  this  apparatus  produces  average  shear  stress  and  shear  strain  characteristics 
of  the  tested  material,  for  a  range  of  shear  strain  rates  (10^  -  10^  s^-1).  High  speed  photographs  are  taken
of  the  deformation  and  failure  of  the  specimens,  in  order  to  determine  their  failure  mechanism.  The  data
will  be  used  to  determine  constitutive  parameters  for  input  into  the  numerical  model  which  is  presented  in 
the  next  section.  This  data  can  also  be  used  to  calibrate  the  various  constitutive  laws  used  in  finite  element 
packages  such  as  EPIC  and  ABAQUS,  which  can  be  used  in  the  modeling  of  explosive  mechanics. 

The  torsional  Hopkinson  bar  is  a  modification  of  an  apparatus  originally  discussed  by  Kolsky  (1949, 
1953).  In  his  device,  thin  cylindrical  wafer-like  specimens  were  placed  between  two  long  elastic  bars,  aligned 
along  a  common  axis.  The  specimen  was  loaded  by  propagating  a  compressive  pulse,  generated  by  impacting 
the  bar  with  a  cylindrical  projectile  of  the  same  material  and  equal  diameter,  down  one  of  the  bars  toward 
the  specimen.  A  similar  device  was  used  by  Harding  et  al.  (1960)  for  material  testing  in  tension.  The 
Hopkinson  bar  was  later  adapted  for  tests  in  torsion,  as  discussed  by  Hartley  et  al.  (1985).

There  are  several  reasons  why  the  TSHB  is  appropriate  for  the  current  experiments  performed  on  solid 
explosives.  First  of  all,  in  torsional  loading,  the  maximum  stress  in  the  specimen  occurs  on  the  exterior  surface 
of  the  material.  The  largest  deformation  will  thus  occur  on  the  exterior  surface,  making  the  probability  of 
hot  spot  formation  greatest,  where  it  can  easily  be  observed.  Also,  the  TSHB  can  be  designed  to  produce 
a  torsional  pulse  of  almost  any  desired  length,  and  hence,  large  amounts  of  deformation  in  the  specimen 
are  possible.  In  addition,  shear  is  the  main  form  of  deformation  present  in  high  rate  deformation  events 
such  as  penetration;  hence,  it  is  desired  to  determine  stress-strain  characteristics  in  shear,  as  opposed  to 
compression  or  tension.  Compression  and  tension  test  results  can  be  converted  into  shear  data  by  using 
a  criterion  such  as  the  von  Mises  equivalence  relation,  but  this  is  not  valid  for  strains  above  20%  [Hartley 
and  Duffy,  1985].  Further  disadvantages  of  compressive  testing  result  from  the  Poisson’s  ratio  effect,  which
causes  radial  expansion  of  materials  loaded  in  compression.  Additional  radial  stresses  in  compressive  tests 
occur  due  to  frictional  effects  between  the  specimen  and  bar.  In  torsional  tests,  there  is  no  Poisson’s  ratio 
effect,  and  hence  no  radial  contraction  or  expansion. 
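
The conversion mentioned above can be sketched as follows, using the usual von Mises equivalence between uniaxial and pure-shear states (tau = sigma / sqrt(3), gamma = sqrt(3) * epsilon). The function name and the sample values are hypothetical, and the sketch does not enforce the 20% strain limit quoted in the text.

```python
import math


def uniaxial_to_shear(sigma, epsilon):
    """Convert uniaxial stress and strain to equivalent shear values.

    Uses the von Mises equivalence: tau = sigma / sqrt(3), gamma = sqrt(3) * epsilon.
    The text notes that this conversion is not valid for strains above 20%.
    """
    tau = sigma / math.sqrt(3.0)
    gamma = math.sqrt(3.0) * epsilon
    return tau, gamma


# Hypothetical compression data point: 300 MPa at 10% strain.
tau, gamma = uniaxial_to_shear(300.0e6, 0.10)
print(f"tau = {tau / 1e6:.1f} MPa, gamma = {gamma:.4f}")
```

The two scale factors are reciprocal, so the product of stress and strain is the same in either description.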

There  are,  however,  some  drawbacks  to  the  TSHB.  First,  tests  may  only  be  run  on  a  limited  range  of 
strain  rates  (10^  -  10^  s^-1).  The  lower  limit  is  due  to  an  increase  in  the  noise  to  signal  ratio,  while  the
upper  limit  is  due  to  the  elastic  limit  of  the  TSHB.  A  further  drawback  is  that  if  fracture  occurs  in  the  explosive
specimen  too  soon,  localization,  and  hence  initiation,  is  less  likely.  In  addition,  the  data  analysis  for  this
apparatus  assumes  homogeneous  deformation.  Hartley  et  al.  (1985)  have  shown  that  it  takes  a  few  reflections
of  the  loading  pulse  from  the  ends  of  the  specimen  before  a  state  of  homogeneous  deformation  is  reached, 
thus  rendering  the  early  time  results  of  the  TSHB  inaccurate. 

2.1  Description  of  Apparatus 

The  TSHB  consists  of  two  elastic  cylindrical  bars:  an  incident  and  transmission  bar;  a  torsional  pulley; 
a  clamp;  and  a  specimen.  A  schematic  of  this  apparatus  is  included  in  Figure  2.  The  two  bars  are  aligned
along  a  common  axis,  with  a  thin  walled  cylindrical  specimen  of  known  geometry  joining  them.  The  end  of
each  bar  in  contact  with  the  specimen  is  milled  to  produce  a  hexagonal  socket,  into  which  the  specimen  or 
an  adaptor  is  inserted.  The  adaptor  is  used  for  cases  in  which  it  is  desirable  to  glue  the  specimen  in  place, 
rather  than  grip  it  with  the  hexagonal  socket.  The  torsional  pulley  is  attached  to  the  end  of  the  incident  bar 
far  from  the  specimen,  and  the  clamp  is  placed  at  a  variable  distance  from  the  pulley,  typically  several  feet. 


The  clamp  is  used  to  prevent  rotation  of  the  incident  bar  while  the  torsional  pulley  is  rotated,  thus  storing 
a  torsional  pulse  in  the  bar  between  the  pulley  and  clamp.  The  sudden  release  of  the  clamp  propagates  an 
incident  shear  strain  pulse  down  the  incident  bar.  The  incident  pulse  reaches  the  specimen,  transmitting 
some  strain  through  the  specimen  into  the  transmission  bar  and  reflecting  some  back  into  the  incident  bar. 
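
The reflected and transmitted pulses are what the data analysis works from. The sketch below follows the standard thin-walled-tube reduction for a torsional Hopkinson bar, assuming torque equilibrium and homogeneous deformation in the gage section: the transmitted bar strain gives the specimen shear stress, and the reflected bar strain gives the shear strain rate. Function names, the synthetic square pulses, the specimen dimensions and the 7075-T6 shear modulus and density are illustrative assumptions in SI units; only the 1 in (0.0254 m) bar diameter comes from the text.

```python
import math


def torsional_wave_speed(G, rho):
    """Elastic torsional wave speed in the bar, c = sqrt(G / rho)."""
    return math.sqrt(G / rho)


def specimen_shear_stress(gamma_t, G, d_bar, d_s, t_s):
    """Average shear stress in the tube wall from the transmitted bar strain.

    Bar torque T = G * pi * d_bar**3 * gamma_t / 16 is balanced by the
    thin-walled tube, T = tau * pi * d_s**2 * t_s / 2.
    """
    return G * d_bar ** 3 * gamma_t / (8.0 * d_s ** 2 * t_s)


def specimen_strain_rate(gamma_r, c, d_bar, d_s, l_s):
    """Magnitude of the average shear strain rate from the reflected bar strain."""
    return 2.0 * c * d_s * abs(gamma_r) / (l_s * d_bar)


def reduce_records(times, gamma_r, gamma_t, G, rho, d_bar, d_s, t_s, l_s):
    """Return (strain, stress, strain_rate) rows; strain by trapezoidal integration."""
    c = torsional_wave_speed(G, rho)
    rows, strain, prev_rate = [], 0.0, None
    for i, t in enumerate(times):
        rate = specimen_strain_rate(gamma_r[i], c, d_bar, d_s, l_s)
        if prev_rate is not None:
            strain += 0.5 * (rate + prev_rate) * (t - times[i - 1])
        stress = specimen_shear_stress(gamma_t[i], G, d_bar, d_s, t_s)
        rows.append((strain, stress, rate))
        prev_rate = rate
    return rows


# Synthetic, idealized square pulses sampled every microsecond (assumed values).
times = [i * 1.0e-6 for i in range(201)]
rows = reduce_records(times, [1.0e-3] * 201, [5.0e-4] * 201,
                      G=26.9e9, rho=2810.0, d_bar=0.0254,
                      d_s=0.0127, t_s=0.000508, l_s=0.0025)
strain, stress, rate = rows[-1]
print(f"strain rate {rate:.0f} 1/s, strain {strain:.3f}, stress {stress / 1e6:.0f} MPa")
```

Here `d_s` is the mean gage diameter, `t_s` the wall thickness and `l_s` the gage length. Because the reduction assumes equilibrium, the first few wave transits of a real record would not be reliable.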

The  two  elastic  bars  are  constructed  of  1  in  diameter  7075-T6  aluminum,  111  in  in  length.  At  the  ends
joining  the  specimen,  a  hexagonal  socket,  of  width  0.5625  in  and  depth  0.25  in,  is  milled  into  the  bars.  Into 
these  sockets,  one  inserts  either  a  hexagonal  specimen  or  an  adaptor  machined  from  7075-T6  aluminum,  to 
which  cylindrical  specimens  are  glued.  The  hexagonal  specimen  (see  Figure  3)  or  adaptor  is  fixed  to  the  bar 
with  12  set  screws,  two  on  each  face  of  the  hexagon.  The  dimensions  of  the  specimens  that  were  used  in 
this  study  are  given  in  Figure  3.  In  Figure  3,  the  central  part  of  the  specimen  is  commonly  referred  to  as
the  gage  length,  while  the  ends  are  termed  flanges.  The  hexagonal  specimen  is  used  to  test  metals  and  the
cylindrical  specimen,  due  to  its  ease  in  machining,  is  used  to  test  the  explosive  simulants  and  explosives.  The
ratios  of  the  wall  thickness  of  the  gage  length  to  the  mean  diameter  of  the  gage  length  for  the  hexagonal  and
cylindrical  specimens  are  0.04  and  0.17,  respectively.  The  elastic  bars  are  supported  along  their  length  by
Delrin  bearings,  which  are  mounted  on  adjustable  bearing  supports.  These  bearing  supports  are  then  fixed
to  a  steel  I-beam  which  supports  the  whole  TSHB  apparatus.

Figure  3:  Scaled  diagram  of  the  specimens  used  in  the  TSHB  tests:  (a)  hexagonal
specimen,  (b)  cylindrical  specimen  (all  dimensions  in  inches).

A  schematic  of  the  torque  generating  mechanism  is  included  in  Figure  4.  The  torsional  pulley  is  clamped 
to  the  incident  bar  by  means  of  a  frictional  fit.  A  hydraulic  hand  pump  is  used  to  pressurize  the  rams,  which 
lengthen,  transferring  force  into  the  cable,  which  in  turn  rotates  the  torsional  pulley.  When  the  clamp  is 
engaged, this action stores torsional elastic energy in the incident bar.


Figure  4:  Scaled  schematic  of  the  torque  generating  mechanism  (all  dimensions  in 
inches). 


Integral to the operation of the TSHB is the clamp which stores the torsional pulse. The key to constant
strain rate tests is rapid release of the clamp, which propagates an incident torsional pulse towards the
specimen. In order to more fully understand the operation and design of the TSHB, the author spent the
summer  of  1995  at  Eglin  AFB  in  the  Advanced  Warheads  Evaluation  Facility  (AWEF)  making  modifications 
to  the  design  of  their  TSHB.  With  the  knowledge  gained  through  these  efforts,  this  author  modified  previous 
clamp  designs  resulting  in  the  design  shown  in  Figure  5.  The  clamp  is  engaged  by  pumping  a  second  hydraulic 
hand  pump,  which  pressurizes  a  hydraulic  C-clamp,  which  in  turn  clamps  the  base  of  the  two  clamp  faces. 
This  action  transmits  pressure  through  the  clamp  faces  onto  the  Hopkinson  bar.  In  order  to  release  the 
clamp, the hydraulic pressure is increased until the break element, as seen in Figure 5, fractures, causing the
release of the clamp faces. Ideally, the incident pulse would be a square pulse of torsion, with instantaneous
rise  and  fall  time  and  constant  magnitude,  thus  producing  deformation  in  the  specimen  at  a  constant  strain 
rate.  For  optimum  functioning  of  the  clamp,  it  is  desired  to  store  large  amounts  of  elastic  energy  in  the 
clamp  in  order  to  achieve  quick  fracture  of  the  break  element  and  consequently,  sudden  release  of  the  clamp. 
This  sudden  release  will  produce  a  pulse  with  a  short  rise  time  and  relatively  constant  magnitude. 




Figure  5:  Schematic  of  the  clamp  for  the  TSHB  (all  dimensions  in  inches). 


In  designing  a  break  element,  it  is  desired  to  use  a  material  with  minimum  ductility,  but  not  so  brittle 
that it will fail before the clamp is tight enough to store the desired torque. Hartley et al. (1985) state that
functional  pin  materials  include  aluminum  6061-T6  and  2024-T6.  In  order  to  determine  the  effect  of  the 
break  element  notch  geometry  on  the  release  of  the  clamp,  the  author  performed  quasi-static  uniaxial  tension 
tests  with  V-notched  and  square-notched  elements.  The  break  element  is  secured  to  the  clamp  faces  with  a 
pin,  hence,  the  clamp  faces  are  free  to  rotate  relative  to  the  break  element.  The  element  thus  experiences 
almost  pure  tension,  validating  the  uniaxial  notch  geometry  tests.  Results  of  these  tests,  reported  in  Caspar 
(1996), revealed that the V-notched element performs better than the square-notched element.

The  break  element  that  was  implemented  into  the  clamp  design  was  machined  from  0.75  in  diameter 
aluminum  6061-T6  rod,  with  the  diameter  at  the  center  of  the  notch  reduced  to  approximately  0.360  in. 
The clamp faces are machined from 4340 steel hardened to about 45 on the Rockwell-C scale. In clamping,
application of vertical forces that would cause bending and axial pulses, which could result in erroneous data,
is avoided by allowing the clamp to move relative to the incident bar. This is accomplished by horizontal
slots  in  the  clamp  face  guides,  as  seen  in  Figure  5.  In  addition,  the  clamp  faces  are  joined  to  the  guides 
by  pins,  allowing  the  clamp  faces  to  rotate  relative  to  the  guide,  and  hence  further  eliminating  axial  and 
bending  pulses  by  establishing  flush  contact  between  the  clamp  faces  and  the  incident  bar. 

A  typical  record  of  the  shear  strain  pulses  recorded  with  the  strain  gages  at  locations  A  and  B  (see 
Figure 2) can be found in Figure 6. The rise time from 10% to 90% of the maximum strain is determined
from this data and subsequent tests to range from 30 to 50 μs. In this figure, it can be seen that the
reflected pulse, which will be shown to be proportional to the shear strain rate in the specimen, is essentially
constant in magnitude while strain is being transmitted. In addition, since the transmitted pulse is shorter
than the reflected pulse, the incident pulse is long enough to strain the specimen to failure.


Figure 6: Typical shear strain pulses in a TSHB for 1018 CRS (Test 4) (horizontal axis: time).
2.2 Analysis

The  values  of  the  shear  stress,  shear  strain  and  shear  strain  rate  experienced  in  the  specimen,  when 
homogeneous  deformation  in  the  specimen  is  assumed,  are  determined  from  an  analysis  of  the  strain  in  the 
incident  and  transmission  bars.  In  this  section,  the  subscript,  s,  is  used  to  denote  the  properties  of  the 
specimen. 

Hartley et al. (1985) have shown that the shear strain rate in the specimen is proportional to the reflected
strain in the incident bar, $\gamma_R$. Integration of the reflected strain over time, $t$, then provides the shear strain
in the specimen, $\gamma_s$:

$$\gamma_s(t) = \frac{2 c D_s}{L_s D} \int_0^t \gamma_R(\xi)\, d\xi , \qquad (1)$$

where $c$ is the elastic torsional wave speed in the incident and transmission bars, $D_s$ is the mean diameter
of the specimen, $L_s$ is the length of the specimen, $D$ is the diameter of the incident and transmission
bars, and $\xi$ is a dummy variable of integration.

Hartley et al. have also shown that the transmitted pulse, $\gamma_T$, provides a measure of the shear stress in
the specimen, $\tau_s$:

$$\tau_s(t) = \frac{G D^3}{8 D_s^2 t_s}\, \gamma_T(t) , \qquad (2)$$

where $G$ is the shear modulus, and $t_s$ is the thickness of the specimen wall.
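
The reduction described by Equations (1) and (2) can be sketched numerically. The following is a minimal Python sketch, not the author's reduction code. It assumes the standard Hartley et al. forms, $\dot{\gamma}_s = 2 c D_s \gamma_R / (L_s D)$ and $\tau_s = G D^3 \gamma_T / (8 D_s^2 t_s)$, uniformly sampled strain records, and trapezoidal integration. The function name, argument names, and the sample values in the usage lines are hypothetical.

```python
def reduce_tshb(gamma_r, gamma_t, dt, c, D, D_s, L_s, t_s, G):
    """Reduce reflected/transmitted bar strains to the specimen response.

    gamma_r, gamma_t : reflected and transmitted shear strain samples
                       (equal length, uniform spacing dt in seconds)
    c   : torsional wave speed in the bars (m/s)
    D   : bar diameter (m);  D_s : mean specimen diameter (m)
    L_s : specimen gage length (m);  t_s : specimen wall thickness (m)
    G   : shear modulus of the bars (Pa)
    Returns (strain_rate, strain, stress) as lists.
    """
    if len(gamma_r) != len(gamma_t):
        raise ValueError("records must have equal length")
    k_rate = 2.0 * c * D_s / (L_s * D)           # assumed form of Eq. (1)
    k_stress = G * D**3 / (8.0 * D_s**2 * t_s)   # assumed form of Eq. (2)

    strain_rate = [k_rate * g for g in gamma_r]
    strain = [0.0] if strain_rate else []
    for i in range(1, len(strain_rate)):
        # Trapezoidal rule for the time integral of the strain rate.
        strain.append(strain[-1] + 0.5 * dt * (strain_rate[i - 1] + strain_rate[i]))
    stress = [k_stress * g for g in gamma_t]
    return strain_rate, strain, stress


# Hypothetical records: constant reflected and transmitted strain for 100 samples.
rate, strain, stress = reduce_tshb(
    [1e-4] * 101, [2e-4] * 101, dt=1e-6,
    c=3100.0, D=0.0254, D_s=0.0127, L_s=0.00254, t_s=0.0005, G=26.9e9,
)
```

A constant reflected pulse gives a constant strain rate and a linearly growing strain, which is the ideal behavior sought with the clamp release described above.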


2.3  Data  Acquisition  and  Reduction 

The shear strain pulses in the incident and transmission bars are recorded by means of electric resistance
strain gage Wheatstone bridges, which are extremely sensitive to small changes in strain. The strain gages
are attached at the midpoints of each bar, locations A and B in Figure 2. With the gages mounted at the
midpoint of the bar, it is possible to record a pulse without its reflection overlapping in time, as long as the
pulse is shorter than the length of the bar. Since the clamp is mounted between the torsional pulley and gage
station A, the incident pulse, being twice as long as the section of bar in which the torque is stored, will always
be shorter than the length of the bar. In addition, since gages A and B are each the same distance from the
specimen, it is ensured that the reflected and transmitted pulses will commence at approximately the same instant in time. There is also
a  strain  gage  bridge  mounted  12  in  from  the  torsional  pulley,  gage  station  C  in  Figure  2.  The  purpose  for 
this  gage  is  to  record  the  stored  torque,  which  is  used  to  obtain  the  expected  strain  rate  incident  upon  the 
specimen.  Stations  A  and  B  consist  of  four  strain  gages;  one  set  of  2  torque  gages  is  mounted  diametrically 
opposite  another  set  on  the  surface  of  the  bar.  Each  gage  is  mounted  at  a  45°  angle  to  the  axis  of  the  bar. 
Strain  gages  mounted  this  way  will  record  only  shear  strain,  cancelling  any  axial  and  bending  strain  which 
may  be  present.  Gage  station  C  consists  of  one  set  of  strain  gages,  also  mounted  at  a  45°  angle  to  the  axis 
of  the  bar. 
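
The relation between the stored length and the incident pulse can be checked with a short calculation. The sketch below is illustrative only. It assumes the elastic torsional wave speed $c = \sqrt{G/\rho}$, handbook-type values for 7075-T6 aluminum (shear modulus of about 26.9 GPa and density of about 2810 kg/m^3, neither taken from this report), and a hypothetical clamp-to-pulley distance of 42 in.

```python
import math

IN_TO_M = 0.0254  # inches to meters


def torsional_wave_speed(G, rho):
    """Elastic torsional wave speed c = sqrt(G / rho) in a round bar."""
    return math.sqrt(G / rho)


def pulse_duration(stored_length, c):
    """Duration of the released pulse: twice the stored length divided by c."""
    return 2.0 * stored_length / c


def pulse_fits_bar(stored_length, bar_length):
    """True when the pulse (twice the stored length) is shorter than the bar."""
    return 2.0 * stored_length < bar_length


# Assumed material values for 7075-T6 aluminum (not from the report).
c = torsional_wave_speed(G=26.9e9, rho=2810.0)
stored = 42.0 * IN_TO_M                         # hypothetical stored length
duration = pulse_duration(stored, c)            # seconds
fits = pulse_fits_bar(stored, 111.0 * IN_TO_M)  # 111 in bar length from the text
```

Under these assumptions the pulse lasts on the order of 0.7 ms and remains shorter than the 111 in bar, so the gages at the bar midpoints can record it without overlap.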

McConnell and Filey (1993) describe the functioning of a Wheatstone bridge and determine the following
equation which calculates the shear strain, $\gamma$, from the change in the bridge output voltage, $\Delta E_o$:

$$\gamma = \frac{2 \Delta E_o (1 + \kappa)^2}{F V G_a N_g \kappa} , \qquad (3)$$

where $F$ is the gage factor, $V$ is the excitation voltage, $G_a$ is the amplifier gain, $N_g$ is the number of active
gages in the Wheatstone bridge, and $\kappa$ is the ratio of resistances in the bridge.
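
As an illustration of the conversion from bridge voltage to shear strain, a minimal Python sketch follows. It assumes that Equation (3) has the form $\gamma = 2 \Delta E_o (1+\kappa)^2 / (F V G_a N_g \kappa)$. The function name and the numerical values in the usage line are hypothetical and do not describe the amplifier settings used in the tests.

```python
def bridge_shear_strain(delta_e, gage_factor, excitation, gain, n_active, kappa=1.0):
    """Shear strain from the change in bridge output voltage.

    Assumed form of Eq. (3):
        gamma = 2 * dE * (1 + kappa)**2 / (F * V * G_a * N_g * kappa)
    delta_e     : change in amplified bridge output (V)
    gage_factor : gage factor F
    excitation  : excitation voltage V (V)
    gain        : amplifier gain G_a
    n_active    : number of active gages N_g
    kappa       : ratio of resistances in the bridge (1 for equal arms)
    """
    denom = gage_factor * excitation * gain * n_active * kappa
    if denom == 0:
        raise ValueError("bridge parameters must be nonzero")
    return 2.0 * delta_e * (1.0 + kappa) ** 2 / denom


# Hypothetical settings: F = 2, V = 10 V, gain = 100, four active gages.
gamma = bridge_shear_strain(0.1, gage_factor=2.0, excitation=10.0, gain=100.0, n_active=4)
```

For an equal-arm bridge ($\kappa = 1$) the factor $(1+\kappa)^2/\kappa$ equals 4, so the expression reduces to $8 \Delta E_o / (F V G_a N_g)$.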

Each  bridge  is  wired  to  a  Measurements  Group  model  2311  signal  conditioning  amplifier,  which  sets  the 
excitation  voltage  and  gain.  The  amplifier  output  from  gage  station  A  is  split,  with  one  lead,  as  well  as 
the amplifier output from gage station B, sent to a Tektronix model TDS 420 digitizing oscilloscope, whose
record is downloaded to a personal computer. This data is reduced by Equations (3), (1), and (2), to determine
nominal shear stress and shear strain characteristics of the specimen. The other lead of the amplifier output
from gage station A is sent to a Hewlett Packard model 214A pulse generator. The incident pulse triggers
the pulse generator, which sends a second pulse, delayed by some specified time, to trigger a Cordin model
607 light source, which illuminates the deformation of the specimen. Photographs of this process, which are
used to determine the failure mechanism of the specimen, are recorded with a Cordin model 330A ultra-high
speed  camera. 

Results from the TSHB were verified by Caspar (1996).


3  Model  Equations 

This  chapter  introduces  the  model  equations  employed  in  this  research.  In  the  first  section,  the  physical 
problem  is  described.  The  assumptions  and  governing  equations  are  then  presented.  These  equations  are 
scaled  and  presented  in  nondimensional  form.  Finally,  the  numerical  method  used  to  solve  the  equations  is 
discussed. 

3.1  Governing  Equations 

The current theoretical study of localization is intended for comparison with tests performed on the
deformation of specimens in the TSHB. Tests have been performed on metals, solid explosive simulants and
solid explosives. In the TSHB, a thin walled circular cylinder is dynamically loaded in torsion; the model is
thus designed to simulate these conditions. A schematic of the specimen described by the model is included
in Figure 7. In this specimen, which has length $L_s$, $z$ is the axial distance variable, $\theta$ is the circumferential
distance variable, and $r$ is the radial distance variable. This specimen is loaded by linearly increasing the
circumferential velocity, $v_\theta$, over time, $t$, at $z = L_s$, to some constant value, $v_1$, while holding the velocity
fixed at $z = 0$. The following assumptions are made in developing the model:

Figure  7:  Specimen  used  for  numerical  simulation. 

•  The  specimen  is  initially  unreacted,  unstressed,  and  at  ambient  temperature. 

• As we are dealing with a pure torsional problem, there is no component of velocity or displacement in
the radial or axial direction: $v_r = v_z = u_r = u_z = 0$.

• Due to axisymmetry and the thin walled geometry, there is no variation in the circumferential or radial
direction: $\partial/\partial\theta = \partial/\partial r = 0$.

•  In  order  to  induce  localization,  we  allow  the  specimen  wall  thickness  to  vary  with  axial  position  z.  We 
make  the  ad  hoc  assumption  that  this  perturbation  is  sufficiently  small  so  as  not  to  introduce  gradients 
in  the  radial  or  circumferential  directions.  It  is  noted  that  alternative  methods  of  perturbation  which  do 
not  require  such  ad  hoc  assumptions,  such  as  perturbation  in  initial  velocity,  temperature,  displacement 
or  strain,  could  also  induce  localization. 

• In pure torsion, the stress tensor reduces to one component, $\sigma_{z\theta}$, the stress on the axial face in the
circumferential direction, which will be referred to as the shear stress, $\tau$.

•  The  shear  strain  is  restricted  to  positive  values. 

•  Plastic  deformation  is  completely  converted  to  heat. 

•  Heat  is  only  transferred  in  the  axial  direction. 

•  The  material  undergoes  a  one-step  chemical  reaction,  with  A  denoting  the  unreacted  material,  and  B 
denoting  the  reacted  material. 

• The density, $\rho$, and the thermal conductivity, $k$, are equal in the unreacted and reacted material and,
along with the specific heats, $c_A$ and $c_B$, they are constant.
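
Under these assumptions, the governing equations are stated below:

$$\rho \frac{\partial v_\theta}{\partial t} = \frac{1}{w} \frac{\partial}{\partial z} \left( w \tau \right) , \qquad (4)$$

$$\rho \frac{\partial e}{\partial t} = \tau \frac{\partial v_\theta}{\partial z} - \frac{1}{w} \frac{\partial}{\partial z} \left( w q_z \right) , \qquad (5)$$

$$\frac{\partial u_\theta}{\partial z} = \gamma , \qquad (6)$$

$$\frac{\partial u_\theta}{\partial t} = v_\theta , \qquad (7)$$

$$\frac{\partial \lambda}{\partial t} = Z (1 - \lambda) \exp \left( -\frac{E}{R T} \right) , \qquad (8)$$

where $e$ is the internal energy, $q_z$ is the heat flux in the axial direction, $u_\theta$ is the displacement in the
circumferential direction, $\gamma$ is the shear strain, $\lambda$ is the reaction progress variable, and $T$ is the temperature.
The parameters $w$, $Z$, $E$, and $R$ are, respectively, the specimen wall thickness, the kinetic rate
constant, the reaction activation energy, and the universal gas constant. Equation (4) models the conservation
of linear momentum. Equation (5) models the conservation of energy. Equation (6) is the definition of strain.
Equation (7) defines velocity as the time derivative of displacement. Finally, Equation (8) is an Arrhenius
kinetics law.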

The constitutive equations used in this model are:

$$\tau = \alpha T^{\nu} \gamma^{\eta} \left| \frac{\partial \gamma}{\partial t} \right|^{\mu - 1} \frac{\partial \gamma}{\partial t} , \qquad (9)$$

$$q_z = -k \frac{\partial T}{\partial z} , \qquad (10)$$

$$e = m_A e_A + m_B e_B , \qquad (11)$$

$$e_A = c_A T + e_A^o , \qquad (12)$$

$$e_B = c_B T + e_B^o , \qquad (13)$$

$$m_A = 1 - \lambda , \qquad (14)$$

$$m_B = \lambda , \qquad (15)$$

where $\alpha$ is the stress constant; subscripts $A$ and $B$ refer to the unreacted and reacted material, respectively;
$e_A$ and $e_B$ are the internal energies; $m_A$ and $m_B$ are the mass fractions; $c_A$ and $c_B$ are the specific heats; and
$e_A^o$ and $e_B^o$ are the energies of formation. Equation (9) is a constitutive law for stress, proposed by Clifton
et al. (1984), where $\nu$, $\eta$, and $\mu$ are the exponents which characterize the thermal softening, the strain
hardening, and the strain rate hardening, respectively. Equation (10) is Fourier's law of heat conduction.
Equation (11) is a mixture law. Equations (12) and (13) are the constitutive laws for energy. Lastly,
Equations (14) and (15) define the mass fractions.

The following boundary conditions are used in this model:

$$v_\theta(t, 0) = 0 , \qquad v_\theta(t, L_s) = \begin{cases} (v_1 - v_0) \dfrac{t}{t_1} + v_0 , & t \le t_1 , \\ v_1 , & t > t_1 , \end{cases}$$

$$u_\theta(t, 0) = 0 , \qquad u_\theta(t, L_s) = \begin{cases} (v_1 - v_0) \dfrac{t^2}{2 t_1} + v_0 t , & t \le t_1 , \\ (v_1 - v_0) \dfrac{t_1}{2} + v_0 t_1 + v_1 (t - t_1) , & t > t_1 , \end{cases}$$

$$\frac{\partial T}{\partial z}(t, 0) = \frac{\partial T}{\partial z}(t, L_s) = 0 , \qquad t > 0 . \qquad (16)$$

That is, $v_\theta$ is fixed at one side of the specimen and ramped over a time $t_1$ from some arbitrarily small
velocity, $v_0$, to a constant value $v_1$. The boundary conditions on displacement are determined by integrating
over time the boundary conditions on velocity. Finally, the boundary conditions on temperature are such
that the ends of the specimen are insulated. The initial conditions are:


$$v_\theta(0, z) = v_0 \frac{z}{L_s} , \qquad u_\theta(0, z) = 0 , \qquad T(0, z) = T_0 , \qquad \lambda(0, z) = 0 , \qquad (17)$$

where the specimen is initially stress free, unreacted and at a uniform temperature, $T_0$.

In  order  to  induce  localization  at  the  center  of  the  specimen,  the  thickness  of  the  tube  is  perturbed  so 
that  there  is  a  continuous  variation  in  its  thickness  with  the  thinnest  portion  being  at  the  center,  an  amount 
of  hp  less  than  at  the  edges.  The  exact  form  of  this  perturbation  is  as  follows: 


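$$w(z) = w_0 - \frac{h_p}{2} \left[ 1 - \cos \left( \frac{2 \pi z}{L_s} \right) \right] , \qquad (18)$$

where $w_0$ denotes the wall thickness at the ends of the specimen.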


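Next, the governing equations are reduced through insertion of the constitutive laws. First, by differentiating
Equation (6) with respect to time and Equation (7) with respect to space, and equating the results,
one determines the following expression relating shear strain rate with velocity gradient:

$$\frac{\partial \gamma}{\partial t} = \frac{\partial v_\theta}{\partial z} . \qquad (19)$$

Now, Equations (6), (9), and (19), are inserted into Equation (4). The energy equation, Equation (5), is
reduced by substitution of Equations (8)-(13) and (19). Finally, Equations (7), (8) and (18) are restated:

$$\rho \frac{\partial v_\theta}{\partial t} = \frac{1}{w} \frac{\partial}{\partial z} \left[ w \alpha T^{\nu} \left( \frac{\partial u_\theta}{\partial z} \right)^{\eta} \left| \frac{\partial v_\theta}{\partial z} \right|^{\mu - 1} \frac{\partial v_\theta}{\partial z} \right] ,$$

$$\rho \left[ c_A (1 - \lambda) + c_B \lambda \right] \frac{\partial T}{\partial t} = \alpha T^{\nu} \left( \frac{\partial u_\theta}{\partial z} \right)^{\eta} \left| \frac{\partial v_\theta}{\partial z} \right|^{\mu + 1} + \frac{k}{w} \frac{\partial}{\partial z} \left( w \frac{\partial T}{\partial z} \right) + Z \rho \left[ Q + (c_A - c_B) T \right] (1 - \lambda) \exp \left( -\frac{E}{R T} \right) ,$$

$$\frac{\partial u_\theta}{\partial t} = v_\theta ,$$

$$\frac{\partial \lambda}{\partial t} = Z (1 - \lambda) \exp \left( -\frac{E}{R T} \right) ,$$

where $Q = e_A^o - e_B^o$ is the heat of reaction.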
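
The reduction of the energy equation can be followed step by step. The derivation below is a sketch under stated assumptions: the energy balance is taken as $\rho\, \partial e / \partial t = \tau\, \partial v_\theta / \partial z - (1/w)\, \partial (w q_z) / \partial z$, the mixture energy as $e = (1-\lambda)(c_A T + e_A^o) + \lambda (c_B T + e_B^o)$, the heat flux as $q_z = -k\, \partial T / \partial z$, and the kinetics as the Arrhenius law of Equation (8). The stress power term is left unexpanded.

```latex
\begin{align*}
\frac{\partial e}{\partial t}
  &= \left[ c_A (1-\lambda) + c_B \lambda \right] \frac{\partial T}{\partial t}
   - \left[ (c_A - c_B) T + \left( e_A^o - e_B^o \right) \right]
     \frac{\partial \lambda}{\partial t} , \\
\rho \left[ c_A (1-\lambda) + c_B \lambda \right] \frac{\partial T}{\partial t}
  &= \tau \frac{\partial v_\theta}{\partial z}
   + \frac{k}{w} \frac{\partial}{\partial z}
     \left( w \frac{\partial T}{\partial z} \right)
   + \rho \left[ Q + (c_A - c_B) T \right] \frac{\partial \lambda}{\partial t} , \\
\frac{\partial \lambda}{\partial t}
  &= Z (1-\lambda) \exp\left( -\frac{E}{R T} \right) ,
  \qquad Q = e_A^o - e_B^o .
\end{align*}
```

The first line follows from differentiating the mixture energy with constant specific heats; the second from inserting it and Fourier's law into the energy balance. The sign of $Q$ shows that the reaction releases heat when the energy of formation of the unreacted material exceeds that of the reacted material.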

In order to numerically solve this system of equations, a spatial discretization was performed using second
order  central  differences.  The  result  is  a  parabolic  system  of  ordinary  differential  equations  in  time,  as 
shown  in  Caspar  (1996).  The  computer  code  LSODE  [Hindmarsh,  1983],  the  Livermore  Solver  for  Ordinary 
Differential  Equations,  was  used  to  step  forward  in  time  to  solve  this  system.  LSODE  solves  initial  value 
problems  for  stiff  or  nonstiff  systems  of  first  order  ordinary  differential  equations  using  the  Gear  (1971) 
method.  A  stiff  system  is  one  whose  ratio  of  largest  to  smallest  eigenvalues  in  the  locally  linearized  solution 
matrix  is  large.  That  is,  the  system  has  rapidly  growing  or  decaying  processes  that  occur  over  a  time  scale 
much  shorter  than  the  overall  time  scale  of  interest.  This  computer  code  is  thus  desirable  for  the  model 
presented  herein  since  the  process  of  shear  localization  occurs  over  a  much  shorter  time  than  the  overall  time 
of  interest.  The  results  of  this  code  were  validated  in  Caspar  (1996),  through  the  use  of  simplified  forms  of 
the  equations  which  had  exact  solutions,  and  also  by  comparing  results  with  those  of  other  researchers  on 
previously  tested  materials. 
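
The approach of discretizing in space and then integrating a stiff system in time can be illustrated on a reduced problem. The sketch below is not the LSODE-based code used in this work. It treats only the conduction term, $\partial T / \partial t = (\kappa / w)\, \partial / \partial z\, (w\, \partial T / \partial z)$ with insulated ends, uses second order central differences in $z$, and replaces the Gear method with a single backward Euler step, which is a simpler implicit scheme suited to stiff problems. The cosine form of the wall thickness, the function names, and all numerical values are assumptions made for illustration.

```python
import math


def wall_thickness(z, L_s, w0, h_p):
    """Perturbed wall thickness, thinnest (w0 - h_p) at mid-length.
    Assumed form: w = w0 - (h_p / 2) * (1 - cos(2*pi*z / L_s))."""
    return w0 - 0.5 * h_p * (1.0 - math.cos(2.0 * math.pi * z / L_s))


def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system by forward elimination and back substitution."""
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def conduction_step(T, dt, L_s, diffusivity, w0, h_p):
    """One backward Euler step of dT/dt = (kappa / w) d/dz (w dT/dz)
    on a uniform grid (at least two nodes) with insulated ends."""
    n = len(T)
    dz = L_s / (n - 1)
    w = [wall_thickness(i * dz, L_s, w0, h_p) for i in range(n)]
    w_half = [wall_thickness((i + 0.5) * dz, L_s, w0, h_p) for i in range(n - 1)]
    r = dt * diffusivity / dz**2
    lower, diag, upper = [0.0] * n, [1.0] * n, [0.0] * n
    for i in range(n):
        # Zero-gradient ends: mirror the neighbouring half-node thickness.
        left = w_half[i - 1] if i > 0 else w_half[0]
        right = w_half[i] if i < n - 1 else w_half[n - 2]
        if i == 0:
            upper[i] = -r * (left + right) / w[i]
        elif i == n - 1:
            lower[i] = -r * (left + right) / w[i]
        else:
            lower[i] = -r * left / w[i]
            upper[i] = -r * right / w[i]
        diag[i] = 1.0 - lower[i] - upper[i]
    return thomas_solve(lower, diag, upper, T)


# Hypothetical hot spot at mid-length of a 2.5 mm specimen.
T = [300.0] * 11
T[5] = 400.0
T_new = conduction_step(T, dt=1e-3, L_s=2.5e-3, diffusivity=1e-7, w0=4e-4, h_p=4e-5)
```

Because the step is implicit, its stability does not depend on the ratio of the time step to the square of the grid spacing. This is the same property that motivates a stiff solver when localization introduces very short time scales.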

4  Results 

This  chapter  will  present  results  determined  from  the  TSHB  as  well  as  those  from  the  theoretical  model. 
Experimental  results  for  tests  on  S-7  tool  steel  (TS)  will  first  be  presented,  with  comparisons  drawn  between 
the  numerical  and  experimental  results,  as  well  as  with  results  determined  from  other  researchers.  Results 
will  then  be  presented  for  tests  on  the  following  explosive  simulants:  a  PBX  cure  cast  simulant,  a  PBX 
pressed  simulant,  and  a  melt  cast  simulant  known  as  Filler-E.  These  simulants  are  used  to  approximate  the 
material  properties  of  PBXN  109,  PBX  9501,  and  tritonal,  respectively.  The  results  of  these  tests  will  be 
used  to  determine  approximate  parameters  for  the  constitutive  law  for  stress  used  in  this  thesis.  Finally, 
numerical  simulations  will  be  run  on  1018  CRS,  and  S-7  tool  steel  to  compare  with  previously  determined 
results,  and  simulations  will  be  run  on  the  aforementioned  explosives. 

4.1  Experimental  Results  on  Explosive  Simulants 

In  this  section,  results  determined  from  the  TSHB  on  tests  of  the  explosive  simulants  are  presented.  The 
strain  and  strain  rate  hardening  parameters  from  the  constitutive  law  for  stress  are  then  determined  from 
the  data.  It  is  important  to  note  that,  to  our  knowledge,  no  researchers  have  tested  any  of  these  materials 
in  torsion.  The  results  presented  in  this  section  are  thus  previously  unrecorded. 

4.1.1  Tests  on  the  PBX  Cure  Cast  Simulant 

The  results  of  tests  performed  at  various  shear  strain  rates  on  the  PBX  cure  cast  simulant  are  included  in 
Figure  8.  The  reported  results  are  characteristic  of  a  few  tests  performed  at  each  strain  rate.  In  this  figure, 
test 40 was performed at a shear strain rate of 415 s⁻¹ and lasted for 700 μs, the full length of the loading
pulse.  The  cause  for  the  initial  overshoot  in  shear  stress  at  a  shear  strain  of  0.05  is  unknown,  but  it  could 
be  a  material  characteristic,  or  due  to  the  loading  geometry.  Zener  and  Hollomon  (1944)  have  stated  that  a 
maximum  in  the  stress-strain  graph  is  indicative  of  the  formation  of  an  instability  in  the  deformation.  The 
specimen  in  this  test  never  failed,  and  since  the  shear  stress-shear  strain  curve  never  reached  a  maximum,  it 
is  assumed  that  no  instability  was  reached.  This  test  thus  provides  an  accurate  measurement  of  the  strain 
hardening  in  this  material.  Prior  to  performing  this  test,  a  line  was  drawn  axially  across  the  specimen.  Post 
test  examination  of  this  line  revealed  no  permanent  deformation  in  the  specimen.  It  is  thus  concluded  that 
this material behaves in a nonlinearly elastic manner over the shear strain rates tested.

Test 41, which was performed at a shear strain rate of 1850 s⁻¹, also lasted for 700 μs. The initial
peak  in  the  shear  stress  at  a  shear  strain  of  0.28  is  a  result  of  the  overshoot,  as  seen  in  Test  40.  Upon 
post-test  examination  of  the  specimen,  a  small  tear  was  noticed  in  the  circumferential  direction  within  the 
gage  length.  This  is  the  consequence  of  an  instability,  which  could  thus  account  for  the  decrease  in  the  slope 
of  the  shear  stress-shear  strain  curve  after  a  shear  strain  of  0.75.  This  test  would  thus  provide  an  inaccurate 
measurement  of  the  material’s  strain  hardening  characteristic.  In  addition,  it  was  noticed  that  even  with 
the  onset  of  instability,  the  deformation  was  recovered,  indicating  purely  elastic  deformation. 

Test 42, which was performed at a shear strain rate of 3500 s⁻¹, lasted for about 400 μs. The peak in the
shear  stress  at  a  shear  strain  of  0.5  is  believed  to  be  a  result  of  the  stress  overshoot.  The  specimen  in  this 
test failed, with an instability thought to occur around a shear strain of 0.8. This instability prevented the
material from further hardening, hence making the overshoot appear to be the occurrence of the instability,
instead of where it is actually thought to occur. Examination of the failure surface revealed voids visible
to  the  naked  eye.  In  addition  it  was  observed  that  failure  did  not  occur  along  a  single  plane,  but  along  an 
irregular  surface,  as  if  the  material  were  torn  apart.  It  is  thus  doubtful  that  this  material  demonstrates  shear 
localization  under  the  given  loading  conditions. 

The data determined herein was then used to calibrate the constitutive law, Equation (9). It is important
to note, however, that it is necessary to perform more tests on this material in order to more accurately
calibrate the constitutive law. This constitutive law introduces the strain hardening parameter, $\eta$, and
the strain rate hardening parameter, $\mu$. Since it is believed that test 40 results in the most accurate
characterization of the strain hardening, $\eta$ was chosen by trial and error, such that the slope determined from
the constitutive model approximately matched the slope of the shear stress-shear strain curve determined
from this test. Due to the onset of instability, the results from tests 41 and 42 are unreliable once a
maximum shear stress is attained. The results up to the maximum stress are accurate, however, and were
used to determine the strain rate hardening parameter, $\mu$. These values are tabulated in a later section.
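
For a power law of the form assumed for Equation (9), the hardening exponents can also be estimated from ratios of measured stresses, as an alternative to the trial and error matching described above. The following Python sketch is illustrative only. It assumes positive strain and strain rate, so that $\tau = \alpha T^{\nu} \gamma^{\eta} \dot{\gamma}^{\mu}$, and it generates synthetic stresses from hypothetical parameter values because the measured curves are not reproduced here. Only the strain rates of 415 s⁻¹ and 1850 s⁻¹ are taken from the tests above.

```python
import math


def flow_stress(alpha, nu, eta, mu, T, gamma, gamma_dot):
    """Power-law stress for positive strain and strain rate:
    tau = alpha * T**nu * gamma**eta * gamma_dot**mu (assumed form of Eq. 9)."""
    return alpha * T**nu * gamma**eta * gamma_dot**mu


def power_law_exponent(x1, y1, x2, y2):
    """Exponent p of y = C * x**p passing through two positive points."""
    if min(x1, x2, y1, y2) <= 0 or x1 == x2:
        raise ValueError("need two distinct positive abscissae and positive ordinates")
    return math.log(y2 / y1) / math.log(x2 / x1)


# Synthetic "measurements" from hypothetical parameters (not test data).
params = dict(alpha=800.0, nu=-1.38, eta=0.40, mu=0.32)
T_amb = 298.0  # K, assumed

# Same strain, two strain rates: the stress ratio isolates mu.
tau_slow = flow_stress(T=T_amb, gamma=0.2, gamma_dot=415.0, **params)
tau_fast = flow_stress(T=T_amb, gamma=0.2, gamma_dot=1850.0, **params)
mu_est = power_law_exponent(415.0, tau_slow, 1850.0, tau_fast)

# Same strain rate, two strains: the stress ratio isolates eta.
tau_a = flow_stress(T=T_amb, gamma=0.1, gamma_dot=415.0, **params)
tau_b = flow_stress(T=T_amb, gamma=0.4, gamma_dot=415.0, **params)
eta_est = power_law_exponent(0.1, tau_a, 0.4, tau_b)
```

With real data the two points must be taken before the maximum stress and at comparable temperature, since the ratio method attributes the entire stress difference to a single variable.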

4.1.2  Tests  on  the  PBX  Pressed  Simulant 

Figure 9 shows the results from tests performed by the TSHB on the PBX pressed simulant, where the
reported results are characteristic of a few tests performed at each strain rate. Referring to this figure, test
29, which was performed at a shear strain rate of 300 s⁻¹, lasted for about 350 μs. The material in this test
also exhibited a stress overshoot, occurring at a shear strain of 0.15. Observation of the post test specimen
revealed a planar failure with a rough failure surface including small voids. This is as would be expected in
microvoid nucleation induced shear localization. The instability which caused failure is assumed to account
for the peak in the shear stress-shear strain curve at a shear strain of 0.065. Test 38, which was performed at
a shear strain rate of 2800 s⁻¹, lasted only 75 μs, due to the high strain rate deformation. No overshoot was
observed in this test. Examination of the post-test specimen revealed fragmentation of the gage length as
well as the flanges. Due to this catastrophic failure, these results may not be truly indicative of the material.
By a fractographic study of the fragments, it was determined that cracks initiated in the gage length and
propagated outward into the flanges at an angle to the cylinder axis, which is indicative of brittle failure.

Figure 9: Results from TSHB tests on the PBX pressed simulant (horizontal axis: nominal shear strain).

In order to more accurately determine the order of events in the failure of this specimen, high speed
photographs were taken of the deformation experienced by one of these simulants. In the test during which
photographs were taken, the shear strain rate was 2850 s⁻¹. A plot of the transmitted shear strain for
this test, which is proportional to the shear stress in the specimen, is included in Figure 10. High speed
photographs of the specimen deformation in this test are included in Figure 11. In these photographs, the
vertical black line to the right of the gage length is a result of the camera removing a strip of light from each
frame for other purposes. In Figure 11, the photograph labeled t = 0 μs was taken when the incident pulse
first reached the specimen. The photograph labeled t = 167 μs was taken some time after the transmitted
strain, as depicted in Figure 10, reached a maximum. In the center of the gage length of the specimen in
this picture, a small crack is visible. From the complete photographic record, not included in this thesis,
this crack formed at approximately t = 113 μs, which is just after the transmitted shear strain reaches a
maximum, as seen in Figure 10, which indicates that the drop in the transmitted strain is the result of this
failure mechanism. From the photograph labeled t = 227 μs, it is seen that the crack has increased in size,
and that other faint cracks have formed below it. Finally, the photograph labeled t = 520 μs, which was
taken of a separate test performed at a comparable shear strain rate, reveals the ultimate fragmentation of
the specimen. It is thus confirmed that cracks initiate in the gage length and propagate outward through
the flanges. In comparison with the failure of the lower strain rate test, Test 29, it is concluded that the
type of failure for this material is dependent on the loading rate.

Figure 10: A plot of the transmitted shear strain for Test 49 on the PBX pressed
simulant, $\dot{\gamma}$ = 2850 s⁻¹.

Figure 11: High speed photographs of the failure of a PBX pressed simulant, Test
49.

In  order  to  develop  a  constitutive  model  for  this  material,  the  parameters  from  Equation  (9)  are  again 
approximated to match the constitutive law to this data. The value of $\eta$ was determined from test 29, since
the material in this test exhibited a greater amount of strain hardening prior to the onset of instability.
The strain rate hardening parameter, $\mu$, was determined such that the constitutive model matched the peak
stress  obtained  in  test  38. 

4.1.3  Tests  on  Filler-E 

Figure 12 shows the results from tests on Filler-E, where the reported results are characteristic of a few
tests performed at each strain rate. In this figure, Test 30, which was performed at a shear strain rate of
370 s⁻¹, lasted 375 μs. Fragmentation of the Filler-E specimen occurred in much the same way as it did in
the high strain rate test on the PBX pressed simulant. Cracks appear to have begun in the gage length
and propagated through the specimen at 45° to the axis of the specimen. On tests when there was minimal
fragmentation, it was possible to observe the failure surface, which was irregular, not at all indicative of
shear localization. On test 33, which was performed at a shear strain rate of 1500 s⁻¹, deformation lasted
120 μs. From Figure 12, it is seen that this material has a high sensitivity to strain rate. Tests at this strain
rate caused significant fragmentation, with failure similar to that for the low strain rate test. Although
Filler-E demonstrates strain and strain rate hardening, the constitutive law could not be fit to a significant
portion of the shear stress-shear strain curves seen in Figure 12. This is due to the fact that Filler-E does
not demonstrate strain hardening that can be represented in the form of a power law.

Figure 12: Results from TSHB tests on Filler-E.

4.2  Numerical  Simulations  on  Nonreactive  Materials 

This  section  presents  the  results  from  numerical  simulations  on  PBXN-109  and  PBX  9501  with  the 
effects  of  reaction  excluded.  The  constitutive  and  material  parameters  used  in  these  numerical  simulations 
are  included  in  Table  1.  The  material  parameters  for  PBX  9501  were  taken  from  Dobratz  and  Crawford 
(1985), and the material parameters for PBXN-109 and tritonal were taken from Hall and Holden (1988).
The  thermal  conductivity  for  PBXN-109  was  not  found,  so  the  value  for  PBXW-114,  a  material  of  similar 
composition,  was  used.  In  order  to  determine  the  thermal  softening  function  for  some  of  the  materials 
they  tested,  Johnson  and  Cook  assumed  a  linear  decrease  in  the  stress  as  a  function  of  temperature,  with 
the  stress  reaching  zero  at  the  melting  point.  Since  PBX  9501  and  PBXN-109  react  before  melting,  their 
thermal  softening  parameters  are  estimated  such  that  the  stress  is  decreased  by  50%  at  the  reaction  initiation 
temperature. From Dobratz and Crawford, PBX 9501 reacts at 240°C, and from Hall and Holden, PBXN-109
reacts at 220°C. The thermal softening parameter for tritonal was estimated such that the stress is decreased
by 90% at the melting temperature of 80°C. The remaining constitutive parameters were determined from
the experimental results on the explosive simulants, as described in the previous section.

              Constitutive Parameters                   Material Parameters
Material      a (MPa·s^n)   ν        η        n         ρ (kg/m³)   c (J/kg·K)   k (W/m·K)
PBX 9501      33,000        -1.28    0.320    0.080     1840        1130         0.454
PBXN-109      800           -1.38    0.400    0.320     1670        1260         0.104
Tritonal      -             -9.68    -        -         1690        960          0.460

Table 1: Constitutive and material parameters used in the numerical calculations.
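
The thermal softening estimate described above can be checked with a short calculation. The sketch below assumes a power-law temperature dependence, with stress proportional to (T/T_ref)^ν, and a reference temperature of 298 K. Neither the functional form nor the reference temperature is stated in this section, and the function name is illustrative only. Under those assumptions, a 50% stress drop at the reaction temperature gives exponents close to the ν values listed for PBX 9501 and PBXN-109 in Table 1.

```python
import math

# Assumed reference temperature in kelvin (not given in this section).
T_REF = 298.0


def softening_exponent(stress_fraction, temperature_k, t_ref=T_REF):
    """Return nu such that (temperature_k / t_ref) ** nu == stress_fraction.

    Assumes a power-law thermal softening term; this form is an assumption.
    """
    return math.log(stress_fraction) / math.log(temperature_k / t_ref)


# Stress reduced to 50% at the reaction initiation temperature.
nu_pbx9501 = softening_exponent(0.5, 513.0)   # PBX 9501 reacts at 240 C (513 K)
nu_pbxn109 = softening_exponent(0.5, 493.0)   # PBXN-109 reacts at 220 C (493 K)

print(f"PBX 9501: nu = {nu_pbx9501:.2f}")
print(f"PBXN-109: nu = {nu_pbxn109:.2f}")
```

With the assumed 298 K reference, the two results round to -1.28 and -1.38.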


4.2.1  PBX  9501  Without  Reaction 

Next,  numerical  simulations  were  performed  for  the  deformation  of  PBX  9501,  with  the  effects  of  reaction 
excluded.  This  was  studied  in  order  to  compare  with  the  experimental  results  on  the  PBX  pressed  simulant 
and  to  determine  the  material’s  susceptibility  to  localization.  Reaction  was  excluded  by  setting  Z  equal  to 
zero  in  the  computer  code.  The  physical  constants,  included  in  Table  2  under  simulation  #1,  were  chosen  to 
match the experimental conditions. For this case, the test was performed at a shear strain rate of 2800 s⁻¹.


Simulation    v1       Ls       t1       w0       hp       v0
Number        (m/s)    (mm)     (μs)     (mm)              (m/s)
1             7.00     2.50     32.14    2.50     0.10     7.00 × 10⁻²
2             6.25     2.50     32.00    2.50     0.10     6.25 × 10⁻²

Table 2: Physical constants used in the numerical simulations reported within this thesis.


In order to determine the onset of localization, the following localization criterion, given by Meyers
(1994), is used:

$$\frac{\partial \tau}{\partial \gamma} + \frac{\partial \tau}{\partial \dot{\gamma}} \left( \frac{\partial \dot{\gamma} / \partial t}{\partial \gamma / \partial t} \right) \leq -\frac{\tau}{\rho c} \frac{\partial \tau}{\partial T}, \qquad (25)$$

which is evaluated at the center of the specimen. The left hand side, Ψ, and right hand side, Φ, of this
criterion are plotted as functions of time in Figure 13. In this criterion, Ψ is a combined measure of the
strain and strain rate hardening effects, while Φ is a measure of the thermal softening. It is seen that both
Ψ and Φ are always positive, indicating that the material is experiencing strain and strain rate hardening
as well as thermal softening, as expected. When Ψ is less than or equal to Φ, this criterion predicts that
localization will begin. For this test, the onset of localization is reached after 1.67 ms.

Figure 13: Localization criterion for PBX 9501 without reaction, where Φ represents
thermal softening and Ψ represents strain and strain rate hardening.

The effects of localization are readily seen by studying the evolution of the velocity and temperature
profiles, as seen in Figures 14 and 15. Figure 14 shows the three stage localization process, which was initially
observed  experimentally  by  Marchand  and  Duffy  (1988).  After  the  velocity  at  z  =  L  reaches  its  final 
value,  the  profile  essentially  forms  a  linear  distribution  in  space,  which  is  called  homogeneous  deformation. 
Marchand  and  Duffy  have  termed  this  Stage  I  of  the  localization  process.  Since  the  specimen  is  thinnest  at 
its  center,  it  is  also  weakest  at  that  point.  The  material  is  thus  locally  less  resistant  to  deformation,  hence 
developing  an  inhomogeneous  velocity  profile  with  the  greatest  slope  at  the  center.  This  is  referred  to  as 
Stage  II.  The  Stage  II  localization  is  very  subtle  in  this  test  and  is  not  readily  observed.  Now,  the  shear 
strain  rate  is  equal  to  the  slope  of  the  velocity  profile,  so,  as  it  increases,  it  causes  the  stress  to  increase.  The 
combined effect of the increased shear stress and shear strain rate causes the temperature to increase, as is
seen  in  Figure  15.  The  rise  in  temperature  causes  the  stress  to  drop,  which  results  in  further  straining  and 
heating.  This  interaction  continues  until,  after  1.67  ms,  the  thermal  softening  dominates  over  the  strain  and 
strain  rate  hardening.  Consequently,  deformation  rapidly  localizes  to  a  narrow  region,  termed  Stage  III. 
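
The competition between hardening and thermal softening described above can be illustrated with a much simpler calculation than the full simulation. The following sketch is not the numerical code used in this study. It assumes a power-law constitutive form, τ = a T^ν γ^η (dγ/dt)^n with T in kelvin, a spatially uniform specimen, a constant strain rate, adiabatic heating with all plastic work converted to heat, and an initial temperature of 298 K. The parameter values are read from Table 1 for PBX 9501, and the reading of each column is itself an assumption. With the strain rate held constant, the hardening measure reduces to ∂τ/∂γ and the softening measure to -(τ/ρc) ∂τ/∂T.

```python
# Assumed constitutive form (not written out in this section):
#   tau = a * T**nu * gamma**eta * gamma_dot**n, T in kelvin, tau in Pa.
A_COEFF = 33000.0e6   # "a" for PBX 9501, converted from MPa to Pa (assumed units)
NU = -1.28            # thermal softening exponent
ETA = 0.320           # strain hardening exponent
N_RATE = 0.080        # strain rate hardening exponent
RHO = 1840.0          # density, kg/m^3
C_HEAT = 1130.0       # specific heat, J/(kg K)
GAMMA_DOT = 2800.0    # nominal shear strain rate, 1/s
T_INITIAL = 298.0     # assumed initial temperature, K


def shear_stress(temperature, gamma):
    """Shear stress in Pa from the assumed power law."""
    return A_COEFF * temperature**NU * gamma**ETA * GAMMA_DOT**N_RATE


def homogeneous_onset(d_gamma=1.0e-4, gamma_max=20.0):
    """March in strain with explicit Euler steps until hardening <= softening.

    Returns (strain, temperature) at the first crossing, or None if no
    crossing is found below gamma_max.
    """
    temperature = T_INITIAL
    gamma = d_gamma
    while gamma < gamma_max:
        tau = shear_stress(temperature, gamma)
        hardening = ETA * tau / gamma                            # d(tau)/d(gamma)
        softening = -NU * tau**2 / (RHO * C_HEAT * temperature)  # -(tau/(rho c)) d(tau)/dT
        if hardening <= softening:
            return gamma, temperature
        # Adiabatic heating: rho * c * dT = tau * d(gamma)
        temperature += tau * d_gamma / (RHO * C_HEAT)
        gamma += d_gamma
    return None


onset = homogeneous_onset()
print(onset)
```

Under these assumptions the crossing occurs at a nominal shear strain of roughly 4, the same order as the value of 4.63 obtained from the full simulation, which also accounts for the specimen geometry and the changing strain rate at the center.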

It  is  this  final  stage  of  localization  that  is  termed  shear  localization,  or  shear  banding.  At  this  time, 
the  rate  of  change  in  the  temperature  increases  dramatically  at  the  center  of  the  shear  band,  resulting  in  a 
pronounced  spike  in  temperature.  The  temperature  at  the  center  of  the  shear  band  at  the  onset  of  localization 
is  458  K.  After  3.2  ms,  the  temperature  at  this  point  has  increased  to  1590  K.  It  is  interesting  to  note 
that  the  temperature  of  the  material  at  the  onset  of  localization  has  already  almost  reached  its  initiation 
temperature  of  513  K,  and  that  following  localization  the  temperature  far  surpasses  this  value.  Hence,  it  is 
expected that shear localization in PBX 9501 would produce initiation of reaction.

These numerical results are now compared to the experimental results determined from the PBX pressed
simulant. Figure 16 compares the experimental and numerical shear stress and shear strain characteristics.
From this figure, it is seen that the computer code predicts the shear stress and shear strain characteristics
fairly accurately until just before failure. The code, however, does not predict localization to begin until a
nominal shear strain of 4.63 is reached, as compared with the experimental failure at 0.20 shear strain. The
full numerically determined shear stress-shear strain curve is included in Figure 17. It is thus concluded that
the PBX pressed simulant does not fail due to shear localization, but instead due to some other mechanism.
This agrees with experimental observations, which suggested that failure could have occurred due to
microvoid nucleation and growth, crack propagation or fragmentation. Since these failure mechanisms were
not built into the numerical model, the code cannot accurately predict this form of failure.

Figure 16: A comparison of the experimental and numerical results for the PBX
pressed simulant.
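
The nominal shear strain quoted for the onset of localization can be related to the onset time with simple arithmetic. The sketch below assumes that the driving velocity rises linearly from rest to its final value over the 32.14 μs listed in Table 2 and then stays constant, so that the nominal strain is the boundary displacement divided by the gage length. That reading of the Table 2 columns is an inference rather than something this section states, the small initial velocity is neglected, and the function name is illustrative.

```python
def nominal_shear_strain(time_s, final_velocity, gage_length, ramp_time):
    """Nominal shear strain for a boundary velocity ramped linearly, then held.

    Assumes the velocity grows linearly from zero to final_velocity over
    ramp_time and is constant afterwards (an assumed loading history).
    """
    rate = final_velocity / gage_length          # steady nominal strain rate, 1/s
    if time_s <= ramp_time:
        return 0.5 * rate * time_s**2 / ramp_time
    return rate * (time_s - 0.5 * ramp_time)


# Simulation 1 values from Table 2, converted to SI units.
strain_rate = 7.00 / 2.50e-3                     # about 2800 1/s
onset_strain = nominal_shear_strain(1.67e-3, 7.00, 2.50e-3, 32.14e-6)

print(f"nominal strain rate: {strain_rate:.0f} 1/s")
print(f"nominal strain at 1.67 ms: {onset_strain:.2f}")
```

With these inputs the strain at 1.67 ms comes out at about 4.63, which matches the value quoted for the onset of localization.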

4.2.2  PBXN-109  Without  Reaction 

Numerical simulations were then performed on PBXN-109, with the effects of reaction excluded. The
physical constants, included in Table 2 under simulation #2, were chosen to match the experimental conditions
of test 42, with a shear strain rate of 2500 s⁻¹. Since it is believed that the PBX cure cast simulant
demonstrates nonlinear elastic deformation, it is questionable whether it can be accurately modeled by the
numerical method, which assumes viscoplastic heating. Despite this, simulations were performed on this
material, with the assumption of viscoplastic heating, in order to learn more about the shear localization and
thermal initiation processes. The localization criterion for this test is included in Figure 18, which predicts
the onset of localization after 272 ms. It is noticed, however, that by the time Φ becomes greater than Ψ,
the parameters have essentially ceased changing. The shear strain and shear strain rate hardening effects
have thus reached a balance with the thermal softening effect. Figure 19 shows the evolution of the velocity
profile for this test. From this figure, it is seen that Stage II is reached, but that transition into localization
does not occur. In comparison with PBX 9501, it is seen that PBXN-109 develops a greater inhomogeneity
in Stage II, but that PBX 9501 is more susceptible to eventual localization. Further iteration in time reveals
that the velocity profile begins to return to a homogeneous state, rather than localizing. Figure 20 shows
the evolution of the temperature profile, in which the temperature increases in an inhomogeneous manner.
It is interesting to notice that the temperature exceeds the reaction temperature of 493 K after about
10 ms. It is thus concluded that shear localization is not necessary for initiation to occur in this material;
an inhomogeneous growth in temperature can eventually lead to significant increases in its value. However,
the time over which this process occurs is significantly longer than the experimentally recorded time to
failure of 350 μs. It is hence determined that the PBXN-109 simulant, as did the PBX 9501 simulant, failed
experimentally by mechanisms which are not included in this model. Furthermore, it is not expected that
deformation in this material can be sustained long enough to increase temperatures into the reactive range.

Figure 17: Experimental and numerical shear stress-shear strain curves up to failure
for the PBX pressed simulant.

Figure 18: Localization criterion for PBXN-109 without reaction.

Figure 19: Evolution of the velocity field for PBXN-109 without reaction.

4.3  Numerical  Simulations  on  Reactive  Materials 

Results are next presented for numerical simulations on reactive materials. The reactive parameters for
PBX 9501, PBXN-109 and tritonal are included in Table 3. Dobratz and Crawford (1985) list maximum
calculated and experimentally determined values of Q for various explosives. They also tabulate values
of Z and E for various explosives, but not for any of those tested herein. For PBX 9501, Dobratz and
Crawford only list the maximum calculated value of Q. For PBX 9404, a similar explosive, the experimental
value is 88.5% of the maximum; hence, the Q value for PBX 9501 is chosen to be 88.5% of its maximum
calculated value. Since PBX 9501 is 95% by weight HMX, the values of Z and E for HMX were used.
Neither PBXN-109 nor tritonal is included in Dobratz and Crawford's handbook. Since PBXN-109 is 64%
RDX, the experimentally determined value of Q and the values of Z and E for RDX were used to simulate
this explosive. Likewise, as tritonal is 80% TNT, the values of Q, Z, and E for TNT were used to simulate
tritonal.

Figure 20: Evolution of the temperature field for PBXN-109 without reaction.


Material      Z (s⁻¹)          Q (kJ/kg)    E (kJ/mol)    R (J/mol·K)
PBX 9501      5.00 × 10^19     5891         220.6         8.314
PBXN-109      2.02 × 10^18     6320         197.1         8.314
Tritonal      2.51 × 10^11     4560         143.9         8.314

Table 3: Reactive constants used in the numerical code.


4.3.1  PBX  9501  With  Reaction 

Results are now presented for simulations of PBX 9501 with the effects of reaction included. The
same parameters were used in this simulation as in the nonreactive case. Including reaction
proved to have little effect on the results prior to initiation. The localization criterion predicted localization
after 1.682 ms, a difference of only 0.3% from the nonreactive case. The temperature at the center of the
specimen at this time was within 0.1% of the corresponding nonreactive temperature. As was anticipated
from the nonreactive case, reaction in the reactive test did occur shortly following the onset of localization.
The evolution of the velocity and temperature profiles for this material appeared very similar to those of the
nonreactive case. Computation was stopped in this simulation shortly following the start of reaction, since
the reaction proceeded so quickly that the time step rapidly approached zero.

The temperature profile for this simulation is similar to that of the PBX pressed simulant, with reaction
occurring prior to severe development of the temperature spike. By observing the evolution of the reaction
progress variable, Figure 21, it is seen how sensitive initiation is to temperature. Appreciable reaction did not
begin until the reaction temperature was reached, at which time reaction quickly initiated in the localized
hot spot. Bowden and Yoffe (1985) have discussed initiation in the context of localized hot spots, shear
localization being only one of the mechanisms by which hot spots are generated. Also, it is observed how
quickly reaction proceeds, with the reaction increasing by more than a factor of 10 in the last 25 μs, to achieve
approximately 1.0% completion at the center of the specimen. It is important, however, to state that the
nominal shear strain reached at initiation is approximately 6.4, whereas the simulant failed after a shear
strain of 0.2 experimentally. As stated previously, experimental observations suggested failure by other
mechanisms, which are not included in this numerical code.

Figure 21: Evolution of the reaction progress for PBX 9501 with reaction.
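
The temperature sensitivity of initiation noted above can be illustrated with the constants in Table 3. The sketch below assumes one-step Arrhenius kinetics, with the reaction rate proportional to Z exp(-E/RT). That is the usual meaning of Z and E, but the rate law is not written out in this section, so the form and the function name are assumptions. Taking the ratio of the rate factor at two temperatures removes any dependence on Z.

```python
import math

R_GAS = 8.314           # universal gas constant, J/(mol K), from Table 3
E_PBX9501 = 220.6e3     # activation energy for PBX 9501 (HMX value), J/mol


def arrhenius_factor(temperature_k, activation_energy=E_PBX9501):
    """Temperature-dependent part of an assumed Arrhenius rate, exp(-E/(R T))."""
    return math.exp(-activation_energy / (R_GAS * temperature_k))


# Compare the rate factor at the 513 K initiation temperature with the
# factor at 458 K, the center temperature reported at the onset of localization.
ratio = arrhenius_factor(513.0) / arrhenius_factor(458.0)
print(f"rate factor ratio, 513 K versus 458 K: {ratio:.0f}")
```

Under this assumed rate law, a rise of 55 K increases the rate factor by a few hundred times, which is consistent with reaction remaining negligible until the temperature spike develops.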

4.3.2  PBXN-109  With  Reaction 

The final material studied in this thesis is PBXN-109 with reaction effects included. Again, the same
parameters were used in this simulation as in the nonreactive case. A study of the velocity profile up to
reaction initiation reveals that no significant inhomogeneity has yet developed. In contrast to the velocity
profile, the temperature profile develops an inhomogeneity. In fact, the temperatures in this inhomogeneity,
although not significant, have grown enough to reach the reaction temperature. The plot of the reaction
progress variable profile (Figure 22) reveals, as expected from the nonreactive case, that reaction has occurred
prior to the onset of localization. In comparison with the results from PBX 9501, however, it is seen that
reaction occurs over a much larger spatial region. This is because the temperature is not
severely localized when it reaches the initiation temperature. Computations were stopped after reaction
reached approximately 1.3% completion, due to the explosive growth in reaction and consequent decrease
in time step to zero. It is thus concluded that shear localization is not necessary to produce initiation. A
mere inhomogeneity, if allowed to grow long enough, can result in the initiation of reaction. For this to
occur, however, all other forms of failure would have to be suppressed. The nominal shear strain reached at
initiation in this simulation was 26.7, far in excess of the experimentally determined nominal shear strain
of 1.5, which was reached at failure. As was the case in the simulation on PBX 9501, it is determined that
the current analytical method is insufficient for modeling failure in PBXN-109.

Figure 22: Evolution of the reaction progress variable profile for PBXN-109 with
reaction.

5  Conclusions 

The  results  included  herein  first  present  constitutive  behavior  in  the  form  of  shear  stress-shear  strain 
curves  for  a  PBX  cure  cast  simulant,  a  PBX  pressed  simulant,  and  a  melt  cast  simulant  known  as  Filler-E. 
These simulants are used to approximate the material properties of PBXN-109, PBX 9501, and tritonal,
respectively. Results from these tests revealed significant dependencies of the shear stress on shear strain
and  shear  strain  rate,  as  compared  with  the  corresponding  dependencies  of  steels.  Observation  of  the  failure 
surface  of  the  various  explosive  simulants  revealed  evidence  of  tearing,  microvoid  nucleation  and  growth,  crack 
propagation  and  fragmentation.  The  data  from  these  tests  can  be  used  to  calibrate  various  constitutive  laws 
which  are  used  to  input  experimental  data  into  analytical  codes.  The  data  from  the  PBX  cure  cast  and 
pressed  simulants  has  been  used  to  calibrate  a  constitutive  law,  proposed  by  Clifton,  et  al.  (1984),  which 
models  shear  stress  by  including  the  effects  of  strain  and  strain  rate  hardening  as  well  as  thermal  softening. 

From numerical simulations on PBX 9501 without the effects of reaction, the three stage localization
process observed experimentally by Marchand and Duffy (1988) was predicted, with the onset of localization
predicted after a nominal shear strain of 4.63. In addition, the subsequent rise in temperature quickly
exceeded the reaction temperature of the material. When the effects of reaction were included, initiation
of reaction began after a nominal shear strain of 6.4. From simulations on PBXN-109 without the effects
of reaction, it was determined that this material is not susceptible to localization. With the inclusion of
reaction effects, however, it was determined that reaction could occur without localization: due to a mere
growth in the inhomogeneous temperature field, reaction initiated after a nominal shear strain of 26.7.

Since localization is assumed to be followed by failure or initiation, numerical results agreed with experimental
results in predicting that PBX 9501 would fail at a lower strain than PBXN-109. Experimentally,
the  corresponding  simulants  failed  after  nominal  shear  strains  of  0.2  and  1.5,  respectively.  In  comparison  of 
the  experimental  and  numerical  values,  however,  it  is  seen  that  the  experimental  tests  failed  at  significantly 
lower  shear  strains.  This  is  not  surprising,  since  experimental  observations  indicated  that  failure  could  have 
occurred  as  a  result  of  the  combined  mechanisms  of  microvoid  nucleation  and  growth,  crack  propagation 
and  fragmentation.  These  mechanisms  are  not  included  in  the  current  analytical  study  and  hence  it  is  not 
possible  for  their  results  to  be  predicted. 

It  is,  however,  concluded  that  if  these  other  mechanisms  were  suppressed,  localization  and/or  initiation 
would  occur  in  the  tested  materials.  Chou  et  al.  (1991)  stated  that  brittle  materials  become  more  ductile 
under the application of hydrostatic stresses. In addition, Frey (1981) concluded that explosives under
compressive stresses generate more heat when being deformed, hence decreasing the time necessary for
initiation to occur. Furthermore, Chou et al. concluded that localization becomes significant in covered
explosives. Finally, Dodd and Atkins (1983) concluded that increased hydrostatic stresses tended to decrease
microvoid  nucleation.  The  explosives  in  deep  earth  penetrators  are  contained  and  subject  to  an  unknown 
amount  of  hydrostatic  stress.  Deformation  under  these  circumstances  could  thus  result  in  the  suppression 
of  failure  mechanisms  and  hence  increase  the  susceptibility  to  localization  and  initiation.  As  a  result,  it 
is  desired  to  perform  future  tests  under  the  application  of  hydrostatic  stresses,  in  order  to  determine  the 
subsequent effect on localization and initiation.

It is also concluded in Caspar (1996) that the most important parameters in the study of shear localization
for  a  given  material  are  the  constitutive  parameters.  In  order  to  increase  the  mass,  and  hence  momentum, 
of  reactive  devices,  studies  are  being  performed  at  Eglin  Air  Force  Base  on  an  explosive  unofficially  termed 
TUNG-5.  This  explosive  uses  tungsten  as  a  binder  for  the  explosive  crystals.  Due  to  its  high  strength,  it  is 
concluded  in  this  thesis  that  such  a  material  would  be  particularly  susceptible  to  shear  localization  and  hence 
reaction.  This  material  would  have  the  material  characteristics  of  a  metal  and  the  reactive  characteristics  of 
an  explosive.  Since  it  is  known  that  metals  are  particularly  susceptible  to  localization,  and  since  this  thesis 
has shown that localization is quickly followed by initiation, significant precautions should be taken in the
development  of  munitions  containing  this  material. 

For future work, several ideas are presented which may yield more accurate results from the TSHB.
First, the incident and transmission bars will be ground straight. This will reduce any bending and axial
pulses, as well as decrease inhomogeneous deformation in the specimen. It is also desired to develop a
better method for aligning the bars. Currently, delrin bearings are used to support the bars. With harder
bearings, the bars can be restrained from bending motion; in addition, it will be easier to determine proper
alignment of the bars by observing the amount of resistance when the bars are rotated by hand. Also, since
preliminary results with new strain gages revealed up to 8% difference in transmitted strain, it is desired to
apply new strain gages to the bars. Finally, it is desired to calibrate the amplifiers in order to determine a
measure of their accuracy.

Abstract

Although  supercritical  fluids  (SFs)  have  generated  intense  research  interest  within  recent  years,  there  is  still 
much  to  be  learned  about  the  intermolecular  interactions  which  occur  between  a  SF  and  a  dissolved  solute.  We  have 
conducted  a  series  of  experiments  in  both  neat  and  modified  supercritical  fluid  systems  to  determine  the  fundamental 
interactions  that  occur  in  the  local  environment  surrounding  a  solute  molecule.  Specifically,  we  have  used  steady-state 
and  time-resolved  fluorescence  spectroscopy  as  a  tool  to  provide  information  on  solute-fluid  interactions  in  supercritical 
water,  supercritical  alkanes,  supercritical  COj-dilated  polymers,  and  cosolvent-modified  supercritical  CO2.  Solvation 
phenomena  surrounding  an  organic  solute  (pyrene)  in  near  and  supercritical  water  have  been  quantified  using  combined 
steacfy-state  and  time-resolved  fluorescence  measurements.  Pyrene  has  also  been  used  to  determine  molecular-level 
interactions  occurring  surrounding  a  solute  dissolved  in  supercritical  n-alkanes  (supercritical  fiiel  precursors). 
Rotational  reorientation  measurements  have  been  used  to  quantify  the  effect  of  a  supercritical  fluid  on  solute  dynamics 
within  a  model  polymer  system.  Rotational  reorientation  measurements  have  also  been  used  to  determine  the  extent  and 
magnitude  of  solute-fluid  clustering  occurring  in  cosolvent-modified  COj. 


A  MOLECULAR-LEVEL  VIEW  OF  SOLVATION 
IN  SUPERCRITICAL  FLUID  SYSTEMS 

Emily  D.  Niemeyer 

Introduction 

The ability to tune the physicochemical properties of supercritical fluids (e.g., refractive index (n), density (ρ),
and dielectric constant (ε)) offers a great advantage for many applications ranging from chromatography and
extractions to processing and synthesis. Below the fluid critical point (defined by its characteristic critical
temperature and pressure), the liquid and gas phases can exist in equilibrium. However, above the critical point, the two
phases coalesce into a single phase that exhibits many of the desirable properties of both gases and liquids - a
supercritical fluid. For example, supercritical fluids possess favorable mass transport and increased solvation, and one
can adjust these parameters by very slight changes in temperature and pressure. Thus, supercritical fluids are often
thought of as completely tunable solvents.

Supercritical fluids have been used industrially in a variety of fields ranging from hazardous waste disposal
to polymer processing. Because many supercritical fluids are environmentally benign (e.g., CO2 and H2O), they have
also become the focus of intensive research aimed at replacing hazardous and/or expensive organic solvents currently in
use. In the food and beverage industry, supercritical CO2 has replaced organic solvents for everything from
decaffeination of coffee to the removal of flavor essences from spices. Supercritical CO2 has also been used by the
petroleum industry for removal of contaminants from coal and the synthesis and processing of commonly used
fluoropolymers. More recently, significant interest has focused on supercritical water oxidation as a methodology for
the disposal of hazardous waste and chemical arms.

It has been well established both theoretically and experimentally that an increase in fluid density
surrounding a solute occurs in the proximity of the fluid critical point. This increased interaction is often termed
“solute-solvent clustering” or “molecular charisma”. These solute-fluid clusters are dynamic in nature, constantly exchanging
fluid molecules with the bulk on a short time scale. In addition, the maximum in clustering has recently been determined
to occur near one half of the critical density, much lower than previously thought. This clustering phenomenon is
known to affect reaction rates and outcome, solute conformational equilibria, and extraction processes. Therefore, in
order to fully exploit the potential of supercritical fluids, one must develop a molecular-level view of solvation in
supercritical solvents like CO2 and H2O.

A portion of the Air Force mission has been aimed toward the development of newer, high-speed aircraft. These
advanced aircraft will rely heavily on the onboard fuel supply as the major means to cool the plane fuselage. As a result,
at any given time, portions of the fuel may be raised above its critical temperature. Therefore, questions about the fuel
stability, the internal fuel dynamics and interactions with dissolved matrix concomitants (additives), and the exchange
properties under supercritical conditions represent key factors that will govern the ultimate performance of the fuels
and, hence, these advanced, high-speed aircraft. For example, if enhanced solute-fluid, solute-cosolvent, and solute-solute
interactions occur and persist at or near the fuel critical point, it is entirely possible for the performance of a given
fuel to plummet and even for the fuel delivery system to fail entirely. For this reason, it is imperative to develop a more
comprehensive molecular-level understanding of the nature of the interactions occurring within supercritical fluid and
supercritical fuel systems.

Toward this end, we have conducted several studies to determine solute-fluid interactions occurring in neat and
modified supercritical fluid systems. Steady-state and time-resolved fluorescence measurements have been used to
experimentally quantify solvation phenomena occurring in near and supercritical water using the fluorescent probe
pyrene (Appl. Spectrosc., submitted for publication). We have also determined the extent of solute-fluid clustering for
pyrene dissolved in supercritical n-alkanes (fuel precursors). Rotational reorientation measurements have been used to
quantify the effect of a supercritical fluid on solute dynamics within a model polymer system (manuscript in preparation).
Rotational reorientation measurements have also been used to determine the extent and magnitude of solute-fluid
clustering in cosolvent-modified CO2. The remainder of this document expands on each of these research areas.


Probing  Molecular-Level  Interactions  Occurring  in  SCW 

Supercritical  water  (SCW)  has  received  much  attention  in  recent  years  because  it  serves  a  key  roll  in  hazardous 
waste  disposal  and  chemical  arms  destruction.  For  example,  environmentally  harmful  organics  can  be  rapidly  and 
efficiently  oxidized  in  SCW  to  benign  compounds  such  as  COj,  HjO,  and  simple  inorganic  salts  and  acids. 

Recently, the first commercial hazardous waste disposal facility based on SCW went on line in Austin, TX, proving that
SCW oxidation is suitable for large-scale processing. Supercritical water oxidation plants have also been
constructed, as an alternative to incineration of municipal sludge and industrial wastes, in Germany, Canada, and Japan.
Although water is likely the most thoroughly studied chemical solvent, when it is raised above its critical point (T_c =
374.4 °C, P_c = 220.55 bar, and ρ_c = 0.281 g/mL), it becomes a solvent with fascinating properties. Specifically,
liquid water is clearly a polar solvent that dissolves ionic species well and hydrophobic solutes poorly.

SCW, in contrast, behaves more like a “nonaqueous” solvent, becoming completely miscible with nonpolar compounds
that are relatively insoluble in liquid water at ambient conditions. Like other supercritical fluids, SCW also has
physicochemical properties that can be tuned between gas-like and liquid-like values, making it an attractive medium for new
reactions and waste disposal. Water above its critical point can also become extremely corrosive and can even dissolve
common materials such as stainless steel and quartz.

Although recent interest in SCW processing has soared, research on the fundamental interactions that occur in
SCW is lagging. For example, there have been many studies aimed toward understanding the kinetics and mechanism of
basic reactions in SCW, but there have been relatively few experiments aimed at quantifying the molecular-level
interactions between the solute and the fluid. Molecular dynamics calculations have been used to model SCW and
supercritical aqueous solutions. These results show that SCW can maintain a solvation shell similar to that seen in
ambient liquid water, with long-range solvation phenomena similar to those observed in other supercritical fluids.
Interestingly, although solute-fluid clustering has been reported using molecular simulations in SCW, the extent of
density augmentation surrounding an organic solute has not been fully quantified experimentally. Also, although
solute-fluid clustering is common in other fluid systems, recent molecular dynamics work by Gao suggests solute-fluid
rarefication (i.e., a decrease in the local density of solvent molecules surrounding the solute) for the benzene dimer in
SCW.

We have aimed to quantify experimentally the local density surrounding a model organic solute dissolved in
SCW. Toward this end, we use steady-state and time-resolved fluorescence to provide a molecular-level view of the
nature of solvation occurring in SCW and to compare these interactions with those observed in other supercritical fluids.

Many of the reasons for the lack of spectroscopic data in SCW are associated with the system’s high
temperature and pressure. Pyrene fluorescence has been studied in water to near-critical temperatures, and these reports
state that pyrene is stable for up to 2 hrs at 345 bar. For this reason and others (vide infra), we have chosen pyrene as
our model organic solute. Fluorescence from pyrene has been used to probe local environments in a variety of media,
including supercritical fluids, and its photophysics are well known. For example, the intensity of the 0-0
transition (I1) is solvent dependent while the 0-3 transition (I3) is solvent insensitive. Thus, the I1/I3 ratio provides a
means to quantify the local environment surrounding the pyrene molecule. This approach has been used previously in
supercritical CO2 to determine the degree of local density augmentation over a broad density region. However,
these measurements become significantly more challenging in SCW because of temperature-induced
broadening of the pyrene emission spectrum. Finally, one must fully deoxygenate the pyrene/water
solutions to minimize solute oxidation.

Experimental 

Reagents  and  Sample  Preparation 

Pyrene (99.9%) was purchased from Aldrich and used as received. Deionized (18 MΩ) ultrafiltered water was
used without further purification. Samples of pyrene in water were prepared by stirring a saturated solution of pyrene
for several days and filtering with a fritted funnel prior to use. This results in a solution with a pyrene concentration
of approximately 0.5 μM. There were no indications of aggregation or ground-state pyrene dimerization under
these conditions.

Sample Deoxygenation

Initially, pyrene solutions were purged with N2 for ca. 45 min prior to beginning the experiment.
Unfortunately, possible pyrene decomposition was evidenced by a broad, red-shifted emission occurring at high
temperatures. Further, on cooling these samples back to ambient conditions, we were unable to recover an emission
spectrum that resembled pyrene in water prior to being subjected to supercritical conditions. In all subsequent
experiments, the pyrene/water solutions were purged with Ar gas for approximately 45 min and were then subjected to
multiple freeze-pump-thaw (FPT) cycles to remove all oxygen from the sample. The Ar/FPT technique resulted in no
detectable decomposition of the pyrene at supercritical temperatures and recovery of a well-resolved pyrene emission
spectrum upon return of the system to ambient conditions.

Instrumentation


A simplified schematic of our titanium high-pressure optical cell is shown in Figure 1. This cell is a modified
version of a system used by the Brill group at the University of Delaware. The cell comprises a titanium body
(TB) which is coned and threaded at top and bottom to accept HiP (Erie, PA) high-pressure fittings (HPF). Sapphire
windows (SW) (Insaco, Quakertown, PA) are sealed into each face of the cell using 24 K gold washers (GW) and
titanium flanges (TF). This particular design allows the sapphire windows to be easily removed for cleaning. The same
basic scheme has also been used to make cells with windows in 90° (not shown) and 180° (Figure 1) geometries for
fluorescence and absorbance measurements, respectively. These cells have been tested to, and routinely used for many
hrs at, 10,000 psia and a maximum temperature of 399 °C. These particular limits are set by the current pump and oven
systems. Separate polyimide-coated fiber optics (FO) (CeramOptec, East Longmeadow, MA) are held against the
sapphire optical windows using a special mounting flange that was developed in-house. This flange holds the optical
fiber securely against the sapphire window face without breakage over the entire temperature range studied. The
manufacturer specifications claim that these polyimide-coated fiber optics can routinely be operated up to 400 °C.

For steady-state fluorescence measurements, the titanium high-pressure cell is incorporated into the
experimental apparatus shown in Figure 2. A high-pressure syringe pump (P) (Isco, Model SFC-500) capable of
producing up to 10,000 psia is operated in the constant-pressure mode to continuously supply oxygen-free pyrene/water
solutions through 1/16" stainless steel tubing (with a preheater coil (PC)) to the high-pressure titanium cell (TC). The
entire cell is located within a temperature-controlled GC oven (Hewlett Packard, Model 5730A). A valve and flow
restrictor (FR) assembly are located outside the GC oven and each is adjusted during an experiment to maintain a
constant solution flow (approximately 300 μL/min) through the cell. This flow-through approach is used to (1) ensure that the fluid
viewed within the cell is homogeneous and supercritical and (2) minimize the actual residence time of the solute at
supercritical conditions. The system pressure is constantly monitored (0.03% accuracy) using a pressure transducer (PT)
(Omega, Stamford, CT). The temperature is adjusted by the oven regulator and monitored using an insulated
thermocouple (TH) (Simpson Accessories, Elgin, IL) located close to the Ti high-pressure cell within the oven. A He-Cd
laser (HCL) (Omnichrome, Model 3074-20M) (325 nm) is used for excitation and an interference filter (10 nm
FWHM, Oriel) is used to prevent any extraneous plasma discharge from reaching the detection electronics. The laser
beam is focused onto the proximal end of an optical fiber (FO), using a fused-silica lens (L) and XYZ translator (T), and
the resulting fluorescence from the sample is collected using a second optical fiber whose output is collected and focused
by a lens onto the entrance slit of an emission monochromator (M) (band pass = 2 nm). After proper wavelength
selection, the signal is detected by a photomultiplier tube (D) and sent to a personal computer (PC) for processing. The
remainder of the spectrofluorometer (SLM-AMINCO 48000 MHF) is configured in the standard ratiometric mode.
Measurement of the pyrene I1 and I3 band intensities is made using software provided with the fluorometer.

The basic setup for our time-resolved fluorescence measurements is shown in Figure 3. An N2 laser (ML) (337
nm, LSI Incorporated, Model LS 337) operating at 20 Hz (with a pulse width of approximately 3 ns) is used for
excitation. A fused-silica beam splitter (BS) serves to send a small portion of the excitation to a photodiode (PD) and the
current pulse from the photodiode triggers a digital sampling oscilloscope (OSC) (Tektronix, Model TDS 350). The
remainder of the excitation beam is focused onto the proximal end of an optical fiber, using a lens (L) and XYZ
translator (T), and the resulting fluorescence emission from the sample is collected by a second optical fiber. A lens is
used to collect and focus the emission through a bandpass filter (BPF) for wavelength selection onto the photocathode of
a photomultiplier tube (PMT). The PMT dynode circuitry is designed for fast response and has been described in detail
previously. The pulsed output from the PMT is directed to the digital oscilloscope and commercial software provided
with the oscilloscope is used to carry out all data acquisition. The excited-state fluorescence lifetime was determined
from a linearized, logarithmic plot of the excited-state decay trace as described elsewhere.
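
The lifetime recovery from a linearized, logarithmic plot can be sketched as follows. This is a minimal illustration, not the software used in this work: it assumes a monoexponential decay, I(t) = I0 exp(-t/τ), so that ln I is linear in t with slope -1/τ. The function name, the synthetic trace, and the 120 ns lifetime are hypothetical.

```python
import numpy as np

def lifetime_from_decay(t_ns, intensity):
    """Fit ln(I) versus t with a straight line and return tau = -1/slope (ns)."""
    t = np.asarray(t_ns, dtype=float)
    log_i = np.log(np.asarray(intensity, dtype=float))
    slope, _intercept = np.polyfit(t, log_i, 1)
    return -1.0 / slope

# Synthetic, noise-free decay trace with a hypothetical 120 ns lifetime.
t = np.linspace(0.0, 400.0, 81)
trace = 3.0 * np.exp(-t / 120.0)
tau = lifetime_from_decay(t, trace)
```

With real traces, only the portion of the decay well after the excitation pulse and above the noise floor should be included in the fit.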

A large range of temperatures (from ambient to supercritical) and reduced densities was studied in this work.
Special emphasis was placed on the reduced density (ρ_r) range between 0.5 and 1.0 (where ρ_r = ρ/ρ_c), where a maximum in
solvent-solute interaction is known to occur in other supercritical fluid systems. Experiments were not conducted
below ρ_r = 0.5 because of poor signal-to-noise (S/N). The density and dielectric constant of water as a function of
temperature and pressure were calculated using a simple, complete equation of state as described by Pitzer and
Sterner. All refractive index terms were calculated using the Clausius-Mossotti equation and the appropriate molar
refractivity and density.
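
As a rough illustration of the refractive index calculation, the sketch below applies the Clausius-Mossotti (Lorentz-Lorenz) relation, (n² - 1)/(n² + 2) = R_m ρ/M, and the reduced density defined above. The molar refractivity of water used here (3.71 mL/mol) is an assumed illustrative value, not a number taken from this work; the critical density is the value quoted earlier in the text.

```python
import math

RHO_C_WATER = 0.281        # g/mL, critical density quoted in the text
MOLAR_MASS_WATER = 18.015  # g/mol
MOLAR_REFRACTIVITY = 3.71  # mL/mol, assumed illustrative value for water

def refractive_index_squared(density, molar_refractivity, molar_mass):
    """Solve (n^2 - 1)/(n^2 + 2) = R_m * rho / M for n^2."""
    x = molar_refractivity * density / molar_mass
    return (1.0 + 2.0 * x) / (1.0 - x)

def reduced_density(density, critical_density):
    """Reduced density rho_r = rho / rho_c."""
    return density / critical_density

# Ambient liquid water (0.997 g/mL) as a sanity check on the relation.
n_ambient = math.sqrt(
    refractive_index_squared(0.997, MOLAR_REFRACTIVITY, MOLAR_MASS_WATER))
rho_r_example = reduced_density(0.1405, RHO_C_WATER)
```

At ambient density this returns a refractive index close to the familiar value for liquid water, which suggests the relation is adequate for estimating n² at the lower densities of interest.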

Results  and  Discussion 

Steady-State  Fluorescence  Emission  Studies 

Figure 4 presents typical steady-state fluorescence spectra for pyrene in water at several temperatures using the
apparatus described in Figure 2. For the low-temperature spectrum, both I1 and I3 are well resolved and easily
distinguishable, showing that our fiber-optic setup provides adequate S/N and spectral resolution to measure the pyrene
spectral features. Once we determined that our experimental apparatus was able to easily measure the pyrene emission
spectrum, we began a systematic study, over a large range of temperatures and pressures, to determine the degree of
thermal broadening and its effect on I1/I3. As the temperature is raised (Figure 4), spectral broadening of the
pyrene emission occurs to the point that I1 and I3 become difficult to distinguish visually.

I1 and I3 values are commonly measured by determining the intensity of the respective vibronic bands. In
liquid water at ambient conditions (25 °C), the I1 band occurs at 376 nm while the I3 band occurs at 383 nm (Figure 4).
However, as temperature increases, the I1 and I3 bands clearly red shift and the entire spectral envelope broadens (Figure
4). Previous studies of pyrene in near-critical water measured I1 and I3 at 376 and 383 nm, respectively; however, our
results clearly illustrate that the I1 and I3 bands actually red shift as the temperature increases. Therefore, it seems less than
ideal to measure I1 and I3 at a constant wavelength in SCW. To circumvent these
problems, we have carefully followed the spectral position of I1 and I3 (using the first derivative of the spectrum if
needed) and calculated I1/I3 using the actual band maxima.
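
The band-following procedure can be sketched as below. This is a minimal illustration under stated assumptions, not the fluorometer software: band maxima are taken where the numerical first derivative changes sign from positive to non-positive, and the synthetic two-band spectrum (hypothetical red-shifted bands at 378 and 385 nm) merely stands in for a measured emission spectrum.

```python
import numpy as np

def band_maxima(wavelength_nm, intensity):
    """Return (wavelengths, intensities) of local maxima located from
    positive-to-non-positive sign changes of the first derivative."""
    w = np.asarray(wavelength_nm, dtype=float)
    y = np.asarray(intensity, dtype=float)
    dy = np.gradient(y, w)
    crossings = np.where((dy[:-1] > 0) & (dy[1:] <= 0))[0]
    # The maximum is the brighter of the two samples bracketing each crossing.
    peaks = [i if y[i] >= y[i + 1] else i + 1 for i in crossings]
    return w[peaks], y[peaks]

# Synthetic spectrum: two Gaussian bands with hypothetical positions and widths.
wavelength = np.linspace(360.0, 420.0, 601)
spectrum = (1.0 * np.exp(-0.5 * ((wavelength - 378.0) / 2.0) ** 2)
            + 0.7 * np.exp(-0.5 * ((wavelength - 385.0) / 2.0) ** 2))
peak_nm, peak_intensity = band_maxima(wavelength, spectrum)
ratio_i1_i3 = peak_intensity[0] / peak_intensity[1]
```

For noisy or strongly broadened spectra, smoothing before differentiation would likely be needed so that noise does not generate spurious sign changes.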

Figure 5 presents the recovered I1/I3 ratios, determined as previously described, for pyrene in water as a
function of reduced density from 26.6 to 398.8 °C. Over this temperature range one can see that the I1/I3 ratio is strongly
influenced by temperature, changing approximately 300% before “leveling off” below a reduced density of about two. It
is well known that temperature strongly influences I1/I3 in polar liquids and supercritical fluids, but measurements
have not, to our knowledge, been made to the high temperatures explored in the current work. Therefore, our
experimental data confirm the decrease in I1/I3 with increasing temperature as previously observed, but over a much
higher temperature range. Our results also suggest that this temperature-induced decrease in I1/I3 for pyrene in water
does not affect the data collected above about 280 °C.

To provide quantitative information on the local environment surrounding pyrene in SCW, we require a link
between the experimental measurable (I1/I3) and some physicochemical property of the solvent. Fortunately, pyrene has
been used previously to determine the extent of local density augmentation in supercritical fluids. For pyrene in
normal liquid solvents, I1/I3 is linear with the well-known π* polarity scale, where a and b are constants:

$$I_1/I_3 = a + b\,\pi^* \tag{1}$$

The π* term, in turn, has been shown to depend linearly on the dielectric cross term, f(ε, n²), where c and d are
constants, ε is the solvent dielectric constant, and n is the solvent refractive index:

$$\pi^* = c + d\,f(\varepsilon, n^2) \tag{2}$$

$$f(\varepsilon, n^2) = \left[\frac{\varepsilon - 1}{2\varepsilon + 1}\right]\left[\frac{n^2 - 1}{2n^2 + 1}\right] \tag{3}$$

Therefore, the relationship between I1/I3 and the dielectric cross term should be linear in the absence of any local density
effects:

$$I_1/I_3 = A + B\,f(\varepsilon, n^2) \tag{4}$$

where A is the vapor-phase I1/I3 value for pyrene (0.41) and B, the slope, is determined by the pyrene I1/I3 at
high-density, liquid-like fluid values.
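
The clustering-free reference line can be sketched numerically as follows. This is only an illustration of the linear relation between I1/I3 and the dielectric cross term: the intercept A = 0.41 is the vapor-phase value quoted above, while the high-density reference point (ε = 20, n² = 1.6, I1/I3 = 1.2) and all function names are hypothetical.

```python
A_VAPOR = 0.41  # vapor-phase I1/I3 for pyrene, as quoted in the text

def dielectric_cross_term(eps, n_squared):
    """f(eps, n^2) = [(eps - 1)/(2 eps + 1)] * [(n^2 - 1)/(2 n^2 + 1)]."""
    return ((eps - 1.0) / (2.0 * eps + 1.0)) * (
        (n_squared - 1.0) / (2.0 * n_squared + 1.0))

def slope_from_reference(ratio_ref, f_ref, intercept=A_VAPOR):
    """Slope B fixed by one high-density, liquid-like reference point."""
    return (ratio_ref - intercept) / f_ref

def predicted_ratio(f_value, slope, intercept=A_VAPOR):
    """I1/I3 expected in the absence of local density effects."""
    return intercept + slope * f_value

# Hypothetical high-density reference point, for illustration only.
f_ref = dielectric_cross_term(20.0, 1.6)
slope_b = slope_from_reference(1.2, f_ref)
ratio_at_half_f = predicted_ratio(0.5 * f_ref, slope_b)
```

Measured ratios that lie above `predicted_ratio` at the same bulk cross term would then be read as evidence of local density augmentation.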

Figure 6 presents our data as a function of f(ε, n²) at 379.9 °C. The solid and dashed lines are the values
predicted if there were no solute-fluid clustering. The line labeled “T_uncomp” is derived by using the I1/I3 at the highest
liquid densities without any compensation for the well-known decrease in I1/I3 with temperature. The dashed line,
“T_comp”, attempts to take better account of the decrease in I1/I3 with increasing temperature by using high-temperature,
high-density I1/I3 values. The upward deviation of the experimentally determined I1/I3 with respect to the
T_comp line is indicative of solute-fluid clustering between pyrene and SCW. Using the differences between the
experimental I1/I3 values and the predicted I1/I3 values, we can estimate the local water density immediately surrounding
the average pyrene molecule as a function of bulk fluid density. By dividing this local term by the bulk SCW density
(ρ_local/ρ_bulk), we can determine the extent of local density augmentation (clustering) surrounding the pyrene.
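
One way to turn the upward deviation into a local density is sketched below. The procedure is an assumption made for illustration and is not necessarily the one used in this work: the observed I1/I3 is converted to an effective cross term through the linear relation with intercept A and slope B, and that cross term is mapped back to a density by interpolating a tabulated, monotonically increasing f(ρ) curve. The table, intercept, slope, and observed ratio are all hypothetical.

```python
import numpy as np

def local_density(ratio_obs, intercept, slope, density_grid, f_grid):
    """Density at which the bulk fluid would give the observed I1/I3.
    f_grid must increase monotonically; np.interp clamps outside the table."""
    f_local = (ratio_obs - intercept) / slope
    return float(np.interp(f_local, f_grid, density_grid))

# Hypothetical bulk-fluid table of density (g/mL) versus cross term.
density_grid = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
f_grid = np.array([0.01, 0.02, 0.03, 0.04, 0.05, 0.06])

rho_bulk = 0.15
rho_local = local_density(0.71, 0.41, 10.0, density_grid, f_grid)
augmentation = rho_local / rho_bulk
```

In this toy case the observed ratio corresponds to a local density of about 0.3 g/mL, twice the bulk value.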

Figure 7 illustrates ρ_local/ρ_bulk as a function of reduced density for pyrene in SCW at 379.9, 384.4, and 398.8
°C. These results show several interesting points. First, the largest degree of density augmentation occurs well below
the critical density. Second, the degree of density augmentation is on the order of 500% at a reduced density near 0.5.
Finally, we see that as the temperature is increased, there is an apparent decrease in the degree of fluid density
augmentation occurring around the pyrene molecule. This decrease in solute-fluid interactions with increasing
temperature has been previously shown to occur in other, milder supercritical fluid systems as well.

It is known, from work on other supercritical fluids, that a maximum in solute-fluid clustering occurs at
approximately one-half the critical density. Although it is difficult to make measurements at these low fluid densities
in SCW, we also observe data fully consistent with a maximum in clustering occurring in this region for pyrene in SCW.
Thus, there is no evidence for solute-fluid rarefication in this system. In supercritical CO2 the typical, maximal local
density enhancement is on the order of 230% for pyrene. Clearly, we observe a much greater degree of local density
enhancement in SCW. This is likely due in part to the ability of water to hydrogen bond more with itself and/or due to
the more polarizable nature of water relative to CO2.

Time-Resolved  Fluorescence  Studies 

Time-resolved fluorescence allows one to access processes that occur on a time scale similar to the excited-state
fluorescence lifetime. Figure 8 presents a typical series of time-resolved fluorescence decay traces for pyrene in
SCW at several temperatures and pressures. Each of these fluorescence decays, within the current time resolution of the
apparatus (5-6 ns), is well described by a monoexponential decay. Fluorescence lifetimes were not measured in the
low-density region because of inadequate S/N. Figure 9 shows that as temperature is increased, we see a systematic decrease
in the pyrene lifetime in SCW. This decrease correlates well with the I1/I3 data (Figure 5). Analysis of these data in
terms of an Arrhenius relationship allows one to determine whether the decrease in fluorescence lifetime is due to a change in
solvation and relaxation pathways or is simply a thermally activated process. Examination of the
Arrhenius plot (Figure 9, inset) shows that the systematic decrease in lifetime is exponentially activated.

Inspection of the literature shows that this same type of phenomenon has been previously observed for pyrene in
ethanol and liquid paraffin between -100 and 140 °C. The activation energies for this process in ethanol and paraffin were
determined to be 9.49 and 12.97 kJ/mol, respectively. For the current work, the recovered activation energy in water
was determined to be 4.07 ± 0.1 kJ/mol. Although differences in these activation energies cannot be easily explained, it
is interesting to note that the activation energy for the process decreases as the solvent polarity increases. To determine
the origin of the decrease in fluorescence lifetime with increasing temperature, it is useful to compare the recovered
activation energies with the activation energy predicted for simple viscous flow conditions (E_η(ethanol) ~ 3 kJ/mol;
E_η(paraffin) ~ 50 kJ/mol; E_η(H2O) = 15.5 kJ/mol). Because the recovered activation energy for the decrease in
pyrene fluorescence lifetime is different from the activation energy for viscous flow, it can be concluded that this
decrease is not due to quenching of the pyrene lifetime by diffusion of an unknown species.
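
The Arrhenius analysis can be sketched as follows. This is a simplified illustration that assumes the total decay rate, k = 1/τ, follows a single Arrhenius term, ln k = ln A - E_a/(RT), so that E_a is recovered from the slope of ln(1/τ) versus 1/T. The temperatures and the prefactor are hypothetical; the synthetic lifetimes are generated with the 4.07 kJ/mol value reported above simply to show that the fit returns it.

```python
import numpy as np

R_GAS = 8.314462618  # J/(mol K)

def activation_energy_kj_per_mol(temps_c, lifetimes_ns):
    """Slope of ln(1/tau) versus 1/T (T in kelvin) gives -Ea/R."""
    temps_k = np.asarray(temps_c, dtype=float) + 273.15
    ln_rate = np.log(1.0 / np.asarray(lifetimes_ns, dtype=float))
    slope, _intercept = np.polyfit(1.0 / temps_k, ln_rate, 1)
    return -slope * R_GAS / 1000.0

# Synthetic lifetimes built from Ea = 4.07 kJ/mol and a hypothetical prefactor.
temps_c = np.array([25.0, 100.0, 200.0, 300.0, 380.0])
rates_per_ns = 0.05 * np.exp(-4.07e3 / (R_GAS * (temps_c + 273.15)))
lifetimes_ns = 1.0 / rates_per_ns
ea = activation_energy_kj_per_mol(temps_c, lifetimes_ns)
```

If a temperature-independent decay channel contributes significantly, it would need to be subtracted from 1/τ before taking the logarithm; that refinement is omitted here.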


In order to explain this decrease in lifetime with increasing temperature, it is useful to examine the electronic
states of pyrene more closely. Specifically, pyrene is known to have a triplet state (T2) that lies only 300 cm⁻¹ above the
first excited singlet state (S1). Under normal conditions, intersystem crossing to this triplet state does not occur due to
a large activation energy between these states, causing the crossing to be energetically unfavorable. Further, if the
rate of vibrational relaxation is fast compared to the rate of intersystem crossing (k_isc) between S1 and T2, the rate of
intersystem crossing can be defined by:

$$k_{isc} = \frac{2\pi}{\hbar}\,D\,J^2\,F(E) \tag{5}$$


where  D  is  the  density  of  the  final  states,  J  is  the  electronic  transition  matrix  element,  and  F(E)  is  the  Franck-Condon 
factor  summed  over  all  states. 

As the system temperature is increased, there is a concomitant increase in the population of pyrene molecules
within the upper vibrational levels (denoted v) of the ground-state manifold (S0). On excitation, these molecules, within
the upper vibrational levels of S0, become promoted into the S1 manifold. By applying the Boltzmann distribution, one
can write the intersystem crossing rate as:

$$k_{isc}(T) = \frac{\sum_v k_{isc}(v)\,e^{-E(v)/k_B T}}{\sum_v e^{-E(v)/k_B T}} \tag{6}$$

where k_isc(v) and E(v) are the rate of intersystem crossing and excess vibrational energy, respectively, at a particular
vibrational level within the S1 manifold, k_B is the Boltzmann constant, and T is the temperature. Therefore, as the upper
vibrational level “gateways” within S1 become populated, the rate of intersystem crossing to the triplet state becomes
thermally activated, causing an increase in intersystem crossing to the triplet state. This opens an additional nonradiative
pathway and leads to a decrease in the excited-state pyrene fluorescence lifetime in water at higher temperatures. Thus,
although solute-fluid clustering is substantial within the pyrene/SCW system, it does not appear to affect the pyrene emissive
rates beyond those seen due to thermal activation.
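
The thermal activation argument can be illustrated with a small numerical sketch. It assumes the usual Boltzmann-weighted average of level-specific crossing rates over the vibrational levels of S1; the two-level model, with a "gateway" level 300 cm⁻¹ above the origin, and both rate constants are hypothetical and chosen only to show the trend with temperature.

```python
import numpy as np

K_B_CM = 0.6950348  # Boltzmann constant in cm^-1 per kelvin

def thermal_isc_rate(rates_per_s, energies_cm, temperature_k):
    """Boltzmann-weighted average of level-specific crossing rates."""
    k = np.asarray(rates_per_s, dtype=float)
    e = np.asarray(energies_cm, dtype=float)
    weights = np.exp(-e / (K_B_CM * temperature_k))
    return float(np.sum(k * weights) / np.sum(weights))

# Hypothetical two-level model: slow crossing from the origin, fast from the gateway.
levels_cm = [0.0, 300.0]
rates = [1.0e5, 1.0e8]
rate_ambient = thermal_isc_rate(rates, levels_cm, 298.15)
rate_supercritical = thermal_isc_rate(rates, levels_cm, 653.15)
```

In this toy model the averaged crossing rate rises as the upper level becomes populated, which would shorten the observed fluorescence lifetime in the manner described above.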

Conclusions

Although significant spectral broadening occurs in the pyrene emission spectra with increasing temperature, we
were able to estimate I1/I3 ratios as a function of reduced density over a wide range of temperatures. I1/I3 is seen to
systematically decrease (approximately 300%) with increasing temperature before leveling off above ~280 °C. This
phenomenon has little to do with solvation per se, but is a result of temperature-induced changes in the pyrene
emission. Deviation of the experimentally determined I1/I3 values from the predicted values above the critical
temperature allows us to estimate the extent of local density augmentation surrounding pyrene dissolved in SCW. These
results show that near one-half the fluid critical density, the local density surrounding pyrene is on the order of five times
the bulk water density. This degree of solute-fluid clustering decreases as the system temperature and
pressure are increased. In fact, at a reduced density of 1.2-1.3, the local and bulk SCW densities surrounding pyrene
appear comparable.

Time-resolved fluorescence measurements of pyrene in water show that as the temperature of the system is
increased, a decrease in the pyrene fluorescence lifetime occurs. This decrease is shown to be exponentially
activated and is a manifestation of thermally populating upper vibrational levels within S0, and in turn S1, that open up
promoter modes to a nearby triplet state (T2). Proximity to the solvent critical point does not appear to affect the pyrene
fluorescence decay kinetics. Thus, while solute-fluid clustering is clearly evident, it does not appear to influence the
pyrene photophysics.

Probing Intermolecular Interactions in Supercritical Alkanes: Mock Supercritical Aviation Fuels

With the eventual introduction of advanced high-speed aircraft, the onboard fuel supply will likely be called
upon to cool the plane fuselage. In some instances, the circulating fuel may be heated such that it becomes a
supercritical fluid (vide supra). It is therefore important to question how the environment surrounding a dissolved
solute/additive may change in an aviation fuel raised above its critical point as a function of fuel density, temperature,
and pressure. Clearly, such changes will govern not only fuel performance and lifetime, but also the design and overall lifetime
of aircraft components.

As a step toward addressing this issue, we have used static fluorescence spectroscopy to determine the effects
of supercritical temperatures on several simple alkane fuel precursors (i.e., n-pentane, n-hexane, and n-heptane).
Fluorescence spectroscopy provides an ideal tool to quantify molecular-level interactions occurring in supercritical
fluids. Due to the inherently low detection limits of fluorescence measurements, it is possible to work at "infinite
dilution," thus minimizing any solute-solute interactions. By using a solute/probe that is sensitive to its local
environment, we are able to access directly the local environment surrounding the probe.

We have used the fluorescent additive/probe pyrene to quantify local intermolecular interactions and to
determine how these interactions may differ from the bulk fluid properties. Pyrene is an ideal solute because its
photophysics are well known and it has been thoroughly studied in a variety of media, including other supercritical
fluids. However, there are many special experimental considerations that must be taken into account when making
these measurements in supercritical alkanes. Due to the high temperature needed to generate a supercritical alkane
(T_c(pentane) = 196.6 °C; T_c(hexane) = 234.4 °C; T_c(heptane) = 267.30 °C), the solute must be stable at harsh temperatures
with minimal spectral broadening. Work in this laboratory (vide supra) has shown that pyrene is stable up to 400 °C in
supercritical water with measurable I1/I3 ratios, and is therefore an ideal probe for the supercritical alkane work.
Deoxygenation is also imperative for these experiments, not only for the stability of the solute, but also to avoid
combustion of the alkanes at high temperatures and pressures.

Experimental

Pyrene (99.9%) was purchased from Aldrich and used as received. Spectrophotometric grade (99%) n-pentane,
n-hexane, and n-heptane were purchased from Aldrich and used without further purification. Solutions of
pyrene in each of the respective alkanes were prepared by first pipetting the appropriate amount of pyrene/ethanol stock
solution into a vessel. The ethanol was then evaporated off, and alkane was added to the vessel to make a 10 μM
solution. All pyrene/n-alkane solutions are deoxygenated by purging with Ar for approximately 1 hr prior to the
experiment.

Steady-state fluorescence measurements are made using the titanium high-pressure cell and fiber optic setup
described previously (vide supra). Deoxygenated pyrene/alkane solution is continuously flowed through the cell using
a flow restrictor assembly and high-pressure syringe pump operated in constant-pressure mode (vide supra). A He-Cd
laser (Omnichrome, Model 3074-20M) (325 nm) is used for excitation and an interference filter (10 nm FWHM, Oriel)
is used to prevent any extraneous plasma discharge from reaching the detection electronics. All measurements are made
using the SLM-AMINCO 48000 MHF spectrofluorometer configured in the standard ratiometric mode. Measurement
of the pyrene I1 and I3 band intensities is made using software provided with the fluorometer.

A large range of reduced densities at a reduced temperature (T_r) of 1.01 (where T_r = T_exp/T_c; T_exp =
experimental temperature; T_c = critical temperature) was studied for each of the n-alkanes, with emphasis placed on the
reduced density range between 0.5 and 1.0, where maximum solvent-solute interactions are known to occur in other
supercritical fluid systems. The density of each of the alkanes was estimated as a function of temperature and pressure
using a commercial software package (SFSolver, Isco, Inc). All refractive index and dielectric constant terms were
calculated using the Clausius-Mossotti equation and the appropriate molar refractivity and density.
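
As a small worked example of the reduced-temperature condition, the sketch below converts T_r = 1.01 into an experimental temperature for each alkane using the critical temperatures quoted earlier in this section. It assumes, as is conventional, that the ratio is formed from absolute temperatures; the function and dictionary names are hypothetical.

```python
# Critical temperatures in degrees Celsius, as quoted in the text.
CRITICAL_TEMPS_C = {"n-pentane": 196.6, "n-hexane": 234.4, "n-heptane": 267.30}

def experimental_temperature_c(critical_temp_c, reduced_temperature):
    """Temperature (deg C) at which T/T_c equals the reduced temperature,
    with the ratio taken in kelvin."""
    return reduced_temperature * (critical_temp_c + 273.15) - 273.15

targets_c = {name: experimental_temperature_c(tc, 1.01)
             for name, tc in CRITICAL_TEMPS_C.items()}
```

For n-pentane this gives roughly 201.3 °C, a few degrees above its critical temperature.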

Results  and  Discussion 

The pyrene I1/I3 ratio has been used to provide insight into the nature of solute-fluid interactions in supercritical
water and CO2. A similar format has been used to quantify the intermolecular interactions occurring in supercritical
alkane systems. Figure 10 presents I1/I3 ratios as a function of reduced density for pyrene in n-pentane (Panel A),
n-hexane (Panel B), and n-heptane (Panel C) at T_r = 1.01. From these data, equations 1 through 4 were used to relate the
I1/I3 ratios to the physical properties of the fluid via the dielectric cross term (f(ε, n²)). Figure 11 shows I1/I3 ratios as a
function of the dielectric cross term for pyrene in n-pentane (Panel A), n-hexane (Panel B), and n-heptane (Panel C)
with associated uncertainties. The solid lines each represent the theoretical I1/I3 ratios in the absence of solute-fluid
clustering. Upward deviation from this line is indicative of solute-fluid clustering, or an increase in the local
density surrounding the pyrene molecule. The upward deviation of the experimental I1/I3 ratios relative to the
theoretical line (no clustering) is then used to calculate the local alkane density surrounding pyrene in each of these
alkane systems. Figure 12 presents the calculated local alkane density (ρ_local) divided by the bulk alkane density (ρ_bulk) in
each of the n-alkanes. The ρ_local/ρ_bulk term can be used to estimate the degree of solute-fluid interactions occurring
around pyrene in these supercritical alkanes. Several things are evident from inspection of Figure 12. First,
the maximum degree of fluid clustering occurs at around one-half the critical density. Second, the maximum in local
density is 3 to 4 times the bulk alkane density. Finally, the degree of clustering is similar in each of the alkane systems
studied.

It has been previously shown for other supercritical fluids (e.g., CO2, H2O) that a maximum in solute-fluid
clustering occurs at approximately one-half the critical density. This is fully consistent with our observations in all
supercritical alkanes, although measurements have not yet been made in the very low density region. The maximum in
local density in the alkane systems of approximately 3 to 4 times the bulk density is consistent with the results seen in
other fluids (vide supra), although slightly higher than those observed in supercritical CO2.

Conclusions 

An increase in local alkane density surrounding pyrene dissolved in supercritical C5, C6, and C7 n-alkanes is
observed at one-half the critical density. The degree of augmentation (3 to 4 times ρbulk) and the maximum in clustering,
at one-half the critical density, are fully consistent with previous results in other supercritical fluids. To our
knowledge, this type of clustering phenomenon has not been previously observed within supercritical alkane systems.

This work points out that additional experimentation is needed to understand how these solute-solvent interactions in
aviation fuel precursors will affect fuel performance.

Effects of CO2 Sorption on Molten Polymer Dynamics

Poly(dimethylsiloxane) (PDMS) polymers are unique silicone polymers which possess a very low glass
transition temperature (Tg) of ~150 K. For this reason, PDMS polymers are molten, viscous polymers which exhibit
flow characteristics at ambient temperatures and are therefore model polymers for other systems with much higher
processing temperatures. PDMS polymers have found a wide range of applications, from coatings and implants to
hydraulic fluids.

There is significant interest in developing simple methodologies to control the transport properties of solid or
molten polymers because these properties govern aspects of polymer synthesis, polymer reactivity, and polymer
processing. It is well known that CO2 can be used to alter the characteristics of certain polymer systems. For example,
recent work has shown that molten polymers like PDMS can sorb tremendous amounts of CO2, leading to a dramatic
decrease in the polymer bulk viscosity. However, there is much to be learned about how gas effusion affects polymer
dynamics, polymer free volume, the mobility of dissolved solutes, and polymer transport properties.

Rotational reorientation measurements have been used extensively to determine the nature of interactions
occurring both within amorphous polymer matrices and at elastic cross-link junctions within various polymer
networks. For example, Fayer and co-workers have recently used the rotational reorientation dynamics of
dansylamide attached to a trifunctional silane within PDMS melts to attempt to relate local rotational dynamics to bulk
polymer properties. However, the rotation of the rather small (~3 Å in length) probe did not correlate well with bulk
polymer dynamics.

We have used a large, neutral solute in conjunction with time-resolved fluorescence anisotropy techniques to
correlate the rotational motion of the solute to bulk polymer properties. Specifically, we are interested in the effects of
the addition of CO2 (at both ambient and supercritical conditions) on the behavior of model dopants within the polymer
matrix and whether we can track such effects. The model solute that we have chosen for our study is BTBP, which has
been previously used for rotational reorientation measurements within a variety of media. BTBP is a large (28 Å in
length) neutral solute. Thus, there will be no charge interactions with the polymer matrix. BTBP has a unity quantum
yield; therefore, small amounts may be dispersed within the polymer while maintaining sufficient S/N. BTBP has
been reported to be a spherical rotor with a single rotational correlation time. For this work, we have measured the
effects of polymer molecular weight and CO2 on the rotational dynamics of BTBP to correlate the rotational correlation
time with the polymer's properties.

Experimental 

Preparation  of  Bulk  Polymer  Solutions 

A broad average molecular weight range (MW = 1250, 2000, 3780, 5970, 9430, 13650, 28000, and 49350
g/mol) of methyl-terminated PDMS was purchased from United Chemical Technologies, Inc. (Bristol, PA) and used
without further purification. N,N'-Bis(2,5-di-tert-butylphenyl)-3,4,9,10-perylenedicarboximide (BTBP) was purchased
from Aldrich and used as received.

Stock solutions of BTBP (1 mM) were prepared in absolute ethanol. BTBP is randomly dispersed within the
PDMS via the following protocol: 1) the appropriate amount of BTBP stock solution is micropipetted into a vial; 2) the
vial is then placed within a hot oven for approximately 1 hr to evaporate any remaining ethanol solvent; 3) after cooling,
the appropriate quantity of PDMS is added to make a final solution that is 1 µM BTBP; and 4) the solutions are stirred
(with gentle heating for higher molecular weight samples) for approximately 2 wks to thoroughly disperse the BTBP
throughout the polymer. There is no evidence of BTBP aggregates.
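
The "appropriate amount" of stock in step 1 follows from the usual dilution relation C1*V1 = C2*V2. The short sketch below shows that arithmetic; the function name and the 10 mL batch size are hypothetical and are used only for illustration.

```python
def stock_volume_ul(stock_molar, final_molar, final_volume_ml):
    """Stock volume in microliters needed so that C1 * V1 = C2 * V2."""
    if stock_molar <= 0 or final_molar < 0 or final_volume_ml < 0:
        raise ValueError("concentrations and volumes must be non-negative, stock > 0")
    stock_volume_ml = final_molar * final_volume_ml / stock_molar
    return stock_volume_ml * 1000.0  # mL -> microliters

# 1 mM stock diluted to 1 micromolar; the 10 mL PDMS batch is a made-up example.
print(stock_volume_ul(1e-3, 1e-6, 10.0))
```

Since the ethanol is evaporated before the PDMS is added, the final volume here is taken to be the PDMS volume alone, which is itself an assumption of this sketch.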

Addition of CO2 to the Polymer Samples

CO2 is added to the polymer solutions via a syringe pump assembly which continuously delivers CO2 to a
high-pressure cell containing the polymer sample. The stainless steel high-pressure cell was developed in-house and has
been described in detail previously. The cell has an optical pathlength of approximately 1 cm and contains quartz
optical windows (Behm Quartz Industries, Dayton, OH) which have been previously shown to exhibit no detectable
pressure-induced birefringence over the pressure range studied.

The BTBP/PDMS solution (3.75 mL) is pipetted directly into the high-pressure cell (internal volume = 5 mL),
into which a Teflon-coated stir bar has been placed. A valve assembly is then connected to the high-pressure cell to
attach it to a high-pressure syringe pump (Isco, Model 260D, Lincoln, NE) operating in constant pressure mode.
Throughout the experiment, a Haake A80 temperature bath is used for temperature control. The temperature is
monitored using a solid-state thermocouple (Cole Parmer, Vernon Hills, IL), and pressure is monitored to within ±1 psi
using a calibrated Heise pressure gauge.

The cell is first charged to the highest pressure (~2500 psi) and allowed to equilibrate at experimental
temperature with constant stirring. Initially, polymer samples were equilibrated overnight before rotational reorientation
measurements were made. Figure 13 presents the recovered rotational correlation time for BTBP in PDMS (MW =
28000 g/mol) as a function of time at approximately 1000 psia. Time zero corresponds to 12 hrs of equilibration time
with stirring. We observe an initial increase in the rotational correlation time, which we attribute to increased pressure
within the system from delivery of the CO2. The systematic decrease of the rotational correlation time with time tracks
the diffusion of CO2 into the PDMS and subsequent dilation of the polymer. This systematic decrease due to continued
dilation of the polymer was shown to continue for upwards of 1 wk. Therefore, all subsequent PDMS/CO2 samples
were equilibrated for approximately 2 wks with stirring at the initial experimental temperature and pressure.

Instrumentation

All time-resolved measurements were made using a multiharmonic frequency-domain fluorometer
(SLM-AMINCO 48000 MHF). For steady-state measurements, a Xe arc lamp is used for excitation with a monochromator for
appropriate wavelength selection of emission and excitation (bandpass = 4 nm). The 514.5 nm line of a CW Ar+ laser
(Coherent, Model Innova 400) is used for excitation during all time-resolved experiments. The output from the laser is
passed through an interference filter to keep any extraneous plasma discharge from reaching the detector.

Fluorescence from the sample is monitored through a 550 nm longpass filter and a polarizer set at the magic angle
condition for fluorescence lifetime measurements. Sinusoidally modulated light is generated using a Pockels cell
driven at 5 MHz, and data are collected from 5 to 200 MHz (39 frequencies). At least 9 replicate measurements were
made. Rhodamine 6G in water was used as the reference lifetime for all excited-state fluorescence lifetime
measurements (τ = 3.85 ns). The BTBP fluorescence lifetime was found to be constant over the range of experimental
conditions. Operation of a typical MHF frequency-domain fluorometer has been described in great detail elsewhere.
Phase and demodulation data were fit to various test models by using a commercially available global analysis software
package (Globals Unlimited).

Frequency-domain measurements of the time-resolved decay of anisotropy are made by measurement of the
differential phase angle (Δ = θ⊥ - θ∥) and polarized modulation ratio (Λ = AC∥/AC⊥). The decay of the intensity of
the parallel (I∥(t)) and perpendicular (I⊥(t)) components of the polarized fluorescence may be
described by:

$$I_{\parallel}(t) = \tfrac{1}{3}\,I(t)\,[1 + 2r(t)] \qquad (7)$$

$$I_{\perp}(t) = \tfrac{1}{3}\,I(t)\,[1 - r(t)] \qquad (8)$$

where r(t) is the fluorescence decay of anisotropy. Assuming that the fluorophore is a spherical rotor, the decay of
anisotropy can be written:

$$r(t) = r_0 \exp\left(-\frac{t}{\phi}\right) \qquad (9)$$

where r0 is the limiting anisotropy, the anisotropy measured in the absence of rotational motion. The rotational
correlation time, φ, may then be recovered by fitting the differential phase angle (Δ) and the polarized modulation ratio
(Λ) as a function of frequency using a non-linear least squares program:


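$$\Delta = \arctan\left[\frac{D_{\parallel}N_{\perp} - N_{\parallel}D_{\perp}}{N_{\parallel}N_{\perp} + D_{\parallel}D_{\perp}}\right] \qquad (10)$$

$$\Lambda = \left[\frac{N_{\parallel}^{2} + D_{\parallel}^{2}}{N_{\perp}^{2} + D_{\perp}^{2}}\right]^{1/2} \qquad (11)$$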


where N and D are the sine and cosine Fourier transforms, respectively, of the polarized intensity components. The
rotational correlation time is then fit using a non-linear least squares minimization of the chi-squared parameter as
defined by:

$$\chi^{2} = \frac{1}{D}\sum_{\omega}\frac{[\Delta_{m}(\omega) - \Delta_{c}(\omega)]^{2}}{\sigma_{\Delta}^{2}} + \frac{1}{D}\sum_{\omega}\frac{[\Lambda_{m}(\omega) - \Lambda_{c}(\omega)]^{2}}{\sigma_{\Lambda}^{2}} \qquad (12)$$

where the subscripts c and m denote the computed and measured differential phase angles and polarized modulation
ratios, respectively, σ² is the variance, and D (in Equation 12) is the number of degrees of freedom. The goodness of fit
between the model and the experimental data is determined by the closeness of the χ² parameter to unity as well as the
randomness of the residuals around zero.
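
The fitting procedure described above can be sketched for the spherical-rotor case. The code below is a minimal illustration and not the Globals Unlimited analysis used in this work: it assumes a single-exponential intensity decay with lifetime τ, uses the analytic sine and cosine transforms of an exponential, and replaces the non-linear least squares routine with a coarse grid scan followed by a golden-section search over φ. All function names, the uncertainties (`sd_delta`, `sd_lambda`), the frequency grid, and the synthetic "measured" data (τ = 3.8 ns, r0 = 0.35, φ = 1.5 ns) are hypothetical.

```python
import math

def phase_and_modulation(freq_mhz, tau_ns, r0, phi_ns):
    """Differential phase (rad) and modulation ratio for a spherical rotor."""
    w = 2.0 * math.pi * freq_mhz * 1e-3        # angular frequency in rad/ns
    k1 = 1.0 / tau_ns                          # decay rate of I(t)
    k2 = k1 + 1.0 / phi_ns                     # decay rate of I(t) * r(t)
    # Sine and cosine transforms of exp(-k t): w/(k^2 + w^2) and k/(k^2 + w^2)
    s1, c1 = w / (k1 * k1 + w * w), k1 / (k1 * k1 + w * w)
    s2, c2 = w / (k2 * k2 + w * w), k2 / (k2 * k2 + w * w)
    # The common factor of 1/3 cancels in both observables and is omitted.
    n_par, d_par = s1 + 2.0 * r0 * s2, c1 + 2.0 * r0 * c2
    n_perp, d_perp = s1 - r0 * s2, c1 - r0 * c2
    # atan2 keeps the correct quadrant if the denominator changes sign.
    delta = math.atan2(d_par * n_perp - n_par * d_perp,
                       n_par * n_perp + d_par * d_perp)
    lam = math.sqrt((n_par ** 2 + d_par ** 2) / (n_perp ** 2 + d_perp ** 2))
    return delta, lam

def chi_squared(phi_ns, freqs, measured, tau_ns, r0, sd_delta, sd_lambda):
    """Reduced chi-squared over phase and modulation residuals."""
    total = 0.0
    for f, (delta_m, lam_m) in zip(freqs, measured):
        delta_c, lam_c = phase_and_modulation(f, tau_ns, r0, phi_ns)
        total += ((delta_m - delta_c) / sd_delta) ** 2
        total += ((lam_m - lam_c) / sd_lambda) ** 2
    dof = 2 * len(freqs) - 1                   # assumption: phi is the only fitted parameter
    return total / dof

def fit_phi(freqs, measured, tau_ns, r0, sd_delta=0.0035, sd_lambda=0.005):
    """Grid scan (0.01 to 100 ns) followed by golden-section refinement."""
    def cost(p):
        return chi_squared(p, freqs, measured, tau_ns, r0, sd_delta, sd_lambda)
    grid = [0.01 * 10.0 ** (i / 20.0) for i in range(81)]
    best = min(range(len(grid)), key=lambda j: cost(grid[j]))
    a, b = grid[max(best - 1, 0)], grid[min(best + 1, len(grid) - 1)]
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(80):
        c, d = b - g * (b - a), a + g * (b - a)
        if cost(c) < cost(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

# Synthetic, noise-free data generated from the model itself (hypothetical values).
freqs = list(range(5, 201, 5))
measured = [phase_and_modulation(f, 3.8, 0.35, 1.5) for f in freqs]
phi_fit = fit_phi(freqs, measured, 3.8, 0.35)
print(f"recovered phi = {phi_fit:.3f} ns")
```

With real data, τ and r0 would typically be measured or fitted as well, which changes the number of degrees of freedom accordingly.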

Results  and  Discussion 

Rotational  Reorientation  Dynamics  within  Molten  Polymers 

The determination of the rotational correlation time of BTBP as a function of MW allows us to determine if our
large neutral solute (BTBP) can track the entanglement value (Me) for PDMS. The entanglement value is a
characteristic value of amorphous polymers at which the chains within the polymer become too long to slip past one
another easily. Although the polymer will still exhibit flow characteristics as the MW increases, a “leveling off” of
physicochemical properties (e.g., viscosity, refractive index) occurs past the entanglement value. Consequently, if
BTBP is large enough to fill the free volume fully within the polymer matrix, the rotational correlation time should level
off above the entanglement value.

Figure 14 presents typical differential phase angle (Panel A) and polarized modulation ratio (Panel B) data for
BTBP in several PDMS molecular weight polymers. The points represent actual experimental data and the lines are
recovered best fits to a single exponential decay law. From these data, it is clear that as the molecular weight of the
polymer increases, the differential phase angle and polarized modulation ratio also increase, indicating an
increase in the rotational correlation time. Figure 15 presents the recovered rotational correlation times as a function of
PDMS molecular weight. We see that as the polymer molecular weight increases, there is an initial increase in the
rotational correlation time and a leveling off at ~10000 g/mol. Entanglement values reported for PDMS
polymers in the literature range between ~8000 and 8625 g/mol. Our rotational reorientation data “break” at a
value that is near these reported literature values. Thus, the large size of the BTBP probe allows us to easily track the
PDMS chain entanglement process.

Rotational reorientation data as a function of PDMS molecular weight also allow us to establish a convenient
link between φ and some aspect of the polymer. Figure 16 presents the BTBP rotational correlation time as a function
of polymer density. The solid line is the straight-line fit. This fit will be used later to interpret the recovered rotational
correlation times in PDMS dilated with CO2.
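
A straight-line calibration of this kind, and its inversion to estimate a polymer density from a measured correlation time, can be sketched as follows. The calibration points are invented for illustration and are not the data of Figure 16; the function names are likewise hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def density_from_phi(phi_ns, slope, intercept):
    """Invert the calibration line: density that corresponds to a given phi."""
    return (phi_ns - intercept) / slope

# Made-up calibration: neat-melt density (g/mL) versus correlation time (ns).
densities = [0.935, 0.950, 0.960, 0.970]
phis = [0.8, 1.4, 1.8, 2.2]
slope, intercept = linear_fit(densities, phis)
print(density_from_phi(1.0, slope, intercept))
```

Using the neat-melt line for CO2-swollen samples assumes that φ depends on the polymer only through its density, which is the working assumption of the analysis rather than something this sketch verifies.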

Effect of CO2 on PDMS Dynamics

Figure 17 presents the recovered rotational correlation time of BTBP as a function of added CO2 pressure in
several representative molecular weight PDMS polymers (i.e., 1250, 9430, 13650, and 28000 g/mol) at 25 °C. There
are several aspects of these data that merit special attention. First, the rotational correlation time decreases dramatically
(up to 5 times) with the addition of liquid CO2 relative to the rotational correlation times observed in the neat PDMS
melts. It is known that CO2 can swell PDMS up to 50% by weight, and our data are fully consistent with the dilation and
subsequent decrease in bulk density of the PDMS polymer with addition of CO2. Second, it appears that the
rotational correlation time can be decreased further by increasing the CO2 pressure. We observe that the rotational
correlation time “snaps” to a much lower value between 500 and 1000 psi of CO2 for all molecular weight polymers
before leveling off with increasing pressure above 1000 psi. Examination of the phase equilibria for CO2 at 25 °C shows
that the transition between the gas and liquid state of CO2 occurs between these pressures. The higher density liquid
CO2 dilates and swells the polymer to a greater extent than the less dense gaseous CO2. The density of the swollen
polymer may be estimated by using the relationship between the PDMS density and the recovered rotational correlation
time in neat PDMS melts as a function of molecular weight (Figure 16). Figure 18 presents the calculated polymer
densities as a function of added CO2 for each of the polymers.

These results prompted us to question whether the BTBP rotational correlation time could actually be tuned by
addition of CO2 to the polymer. From the previous 25 °C experiments, we noted a sharp change in the rotational
correlation time as a function of pressure. It is known that CO2 above its critical temperature (Tc = 31.1 °C; Pc = 1070.4
psia) exhibits no phase boundary with increasing pressure. Thus, we questioned if we could tune the BTBP rotational
reorientation time. Figure 19 presents the recovered rotational correlation times for BTBP in PDMS (MW = 9430
g/mol) at ambient (T = 25.0 °C) and supercritical (T = 36.5 °C) conditions as a function of CO2 pressure. The recovered
rotational times for BTBP in the PDMS swelled with supercritical CO2 are slightly lower due to the increase in
temperature, which decreases φ (Figure 19A). More interestingly, closer examination of the T = 36.5 °C
data in Figure 19B shows that instead of the sharp change in rotational correlation time observed with increasing CO2
pressure in the ambient system (vide supra), we observe a very gradual, easily tuned decrease in the BTBP
rotational correlation time. The inherent tunability of the physicochemical properties of supercritical CO2 with
temperature and pressure can therefore be used to tune the PDMS matrix and hence the BTBP rotational dynamics.

Conclusions

We have shown that our large model solute, BTBP, can be used to track the bulk polymer dynamics within the
PDMS polymer matrix. The rotational correlation time is shown to scale with polymer molecular weight before leveling
off above the known PDMS entanglement value. The rotational correlation time of BTBP within these molten polymers
has been used to relate the recovered rotational dynamics within CO2-dilated PDMS to the bulk density of the swelled
polymer matrix. We have shown that the addition of ambient temperature CO2 to PDMS causes a dramatic decrease in
the BTBP rotational correlation time near the CO2 gas-liquid phase boundary (25 °C). Addition of supercritical CO2
allows us to tune the bulk polymer density and the BTBP rotational dynamics.

Cosolvent Effects on Rotational Reorientation Dynamics in Supercritical CO2

The inherent tunability of supercritical fluids has made them attractive for use in separations, chemical reactions,
and extractions. Supercritical CO2 (scCO2) is environmentally friendly, inexpensive, and has very mild critical
parameters (Tc = 31.1 °C; Pc = 1070.4 psia; ρc = 0.468 g/mL). Supercritical CO2 has found a wide range of industrial
applications, including replacement of hazardous solvents in the clothing dry cleaning process, decaffeination of coffee
beans, and a wide range of extractions and reactions in the petroleum, polymer, and pharmaceutical industries. Although
scCO2 is by far the most commonly used supercritical solvent, it is, unfortunately, a rather poor solvent for polar solutes.
CO2 is nonpolar and has a relatively low solvent strength. In addition, although there is inherent tunability of the scCO2
solvent strength with pressure and temperature, it still exhibits a cohesive energy density less than cyclohexane and
orders of magnitude less than common industrial solvents such as methanol and acetonitrile. This low cohesive energy
density translates into a lack of selectivity for chromatographic separations, extractions, and reactions.

There are several approaches to “modify” CO2 in order to improve the power and hence selectivity of the
solvent. The addition of small quantities (x = 0.1 to 5 mol%) of an organic entrainer (cosolvent) has been shown to
dramatically increase solute loading as well as increase reaction/separation selectivity. More recently, a
perfluoropolyether-based surfactant (PFPE) has been used to form stable reverse micelles in scCO2. However,
although promising, these PFPE microemulsions are a relatively new technology and have not yet been thoroughly
characterized. In addition, there are many recognized advantages to using the simpler cosolvent-modified systems.
For example, cosolvents do not significantly alter the CO2 critical properties. In addition, only small quantities of
entrainer need to be added, giving both an economic and a waste disposal advantage. Solute loading of both
polar and nonpolar solutes is enhanced, and the selectivity of reactions, extractions, and separations is improved.
Many of these desirable properties have been attributed to preferential solvation of the cosolvent around the dissolved
solute in these supercritical systems. Given this, the goal of this work was to experimentally quantify preferential
solvation in a supercritical fluid system.

Rotational reorientation dynamics measurements offer a convenient means to relate the solute's dynamics to
the local solvent microdomain (vide supra). We have used the fluorescent probe BTBP in concert with time-resolved
fluorescence anisotropy techniques to determine rotational dynamics of a solute as a function of MeOH-modified scCO2
density. These data have allowed us to quantify the extent of solute-fluid clustering and determine how the solvation
shell surrounding the solute changes as a function of bulk density.

Experimental 

Preparation  of  the  Cosolvent  Mixture 

Methanol (spectrophotometric grade, Aldrich) is charged into a stainless steel vessel equipped with two 1/8"
tubing pieces, one leading to a SFC grade CO2 tank (Scott Specialty Gases, Plumsteadville, PA) and the other to a
high-pressure syringe pump (Model 260D, Isco). After charging the MeOH, the vessel is promptly placed into an ice bath
and CO2 is added to the vessel as it is vigorously shaken and stirred to ensure that the mixture is one phase. The mixture
is then transferred into the syringe pump, and the pump piston is run in and out several times to ensure mixing. The
pump head is then heated to experimental temperature (~45 °C) and allowed to equilibrate for several days to complete
the mixing process. The MeOH/CO2 mixture is then delivered to our high-pressure stainless steel optical cell (vide
supra), which has been previously charged with BTBP. All MeOH/CO2 mixtures for this experiment were 5 mol%
MeOH. The cell is initially charged with the highest experimental pressure of the cosolvent mixture (~3000 psia), and
the pressure is gradually decreased throughout the experiment until the mixture becomes biphasic.

All mixture densities and viscosities were calculated using the equations described by Foster and co-workers.

Results and Discussion

The steady-state emission and excitation spectra of BTBP in MeOH-modified CO2 at 45 °C and 2667 psia are
shown in Figure 20. Figure 21 presents the effects of density on the excited-state fluorescence lifetimes for BTBP in
MeOH-modified CO2 at 46.0 °C. As mentioned previously, it has been established that the fluorescence lifetime for
BTBP is relatively constant as a function of its local environment. However, we observe a systematic decrease in the
BTBP fluorescence lifetime with increasing MeOH/CO2 bulk density. This phenomenon may be readily explained by the
Strickler-Berg relationship, which links the radiative rate of the fluorophore (kr) to the properties of the solvent and
fluorophore through the following equation:

$$k_r = 2900\,n^{2}\,\nu_0^{2}\int \varepsilon\, d\nu \qquad (13)$$

where n is the solvent refractive index, ν0 is the peak frequency in the fluorophore absorbance spectrum, and ∫ε dν is the
integrated area under the fluorophore absorbance spectrum. The radiative rate is directly related to the quantum yield
(Φ) and fluorescence lifetime (τ) of the fluorophore (kr = Φ/τ). If the quantum yield of the probe is unity, the radiative
rate is the inverse of the fluorescence lifetime. Because the radiative rate is proportional to the square of the refractive
index, if the observed change in fluorescence lifetime is due to the changing refractive index with increasing pressure,
a plot of 1/τ vs n² should be linear. Figure 22 shows that a Strickler-Berg interpretation of our data reasonably explains
(r² = 0.926) the decrease in fluorescence lifetime of BTBP as the refractive index of the MeOH/CO2 mixture increases
with increasing pressure. Similar results (not shown) have been determined for BTBP in neat CO2 at 46.0 °C.
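
The linearity check behind this argument amounts to regressing 1/τ against n² and inspecting the coefficient of determination. The sketch below shows one way to do that; the function names are hypothetical and the refractive indices and lifetimes are synthetic values constructed to lie exactly on a line through the origin, not the data of Figure 22.

```python
def r_squared(xs, ys):
    """Coefficient of determination for a simple linear regression of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy * sxy / (sxx * syy)

def strickler_berg_r2(refractive_indices, lifetimes_ns):
    """r^2 of 1/tau versus n^2, the plot described for a unity quantum yield probe."""
    n_squared = [n * n for n in refractive_indices]
    inverse_tau = [1.0 / t for t in lifetimes_ns]
    return r_squared(n_squared, inverse_tau)

# Synthetic example: lifetimes generated from 1/tau = 0.2 * n^2 (arbitrary constant).
indices = [1.10, 1.15, 1.20, 1.25]
lifetimes = [1.0 / (0.2 * n * n) for n in indices]
print(strickler_berg_r2(indices, lifetimes))
```

Real data scatter about the line, so values below unity (such as the 0.926 reported here) are expected.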

Through the Debye-Stokes-Einstein (DSE) equation, the rotational reorientation time of a solute can be directly
related to the physical parameters of the system, including the viscosity, temperature, and the volume of the reorienting
species. Figure 23 presents the experimental rotational correlation time for BTBP as a function of cosolvent mixture
bulk density at 46.0 °C. The rotational correlation time predicted from the DSE equation (given the volume of BTBP,
the temperature, and the calculated bulk viscosity) as well as the rotational correlation time for BTBP in neat CO2 are
denoted. In the low density region, we see that the rotational correlation time deviates significantly from predicted
values and only nears the DSE prediction at higher densities. This same phenomenon has been observed for BTBP in
neat CO2 at 35 °C and is attributed to a clustering of solvent molecules around the solute, increasing the effective
fluorophore volume and hence the rotational correlation time. This solute-fluid clustering dissipates at higher
liquid-like bulk densities, causing the rotational correlation time to more closely follow DSE predictions. The rotational
correlation time in our system can allow us to predict the size of the reorienting species as well as determine the
solvation of MeOH surrounding the BTBP.
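
For reference, the DSE prediction in its simplest form is φ = ηV/(kB·T), with η the viscosity, V the volume of the reorienting species, and T the absolute temperature. The sketch below evaluates that expression in SI units; the function name and the numerical inputs are assumptions chosen for illustration, and any boundary-condition or shape factors are omitted.

```python
BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value

def dse_correlation_time_s(viscosity_pa_s, volume_m3, temperature_k):
    """Debye-Stokes-Einstein rotational correlation time, phi = eta * V / (kB * T)."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return viscosity_pa_s * volume_m3 / (BOLTZMANN_J_PER_K * temperature_k)

# Hypothetical inputs: a dense-fluid viscosity and a 1 nm^3 rotor at 46.0 C.
phi_dse = dse_correlation_time_s(5.0e-5, 1.0e-27, 319.15)
print(f"{phi_dse * 1e12:.1f} ps")
```

Dividing a measured correlation time by this prediction gives the kind of ratio that is used to gauge clustering: a value above one suggests a larger effective rotor or a higher local viscosity than the bulk fluid implies.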

The observed rotational correlation time (φobs) can be used in conjunction with the rotational correlation time of
BTBP in MeOH (φMeOH = 150 ps) and the rotational correlation time in neat CO2 (φCO2 = 34 ps) to estimate the fraction
of MeOH (fMeOH) and CO2 (fCO2) observed by the probe through the following equation:

$$\phi_{obs} = \phi_{MeOH}\,f_{MeOH} + \phi_{CO_2}\,f_{CO_2} \qquad (14)$$


Figure 24 presents the percentage of MeOH estimated using this analysis as a function of fluid density. Recall that the
bulk MeOH mole fraction is only 5%. We observe a very large degree of preferential solvation (30% MeOH; 6-fold)
surrounding the BTBP in the low density region before leveling off in the higher density region. It is interesting to note
that the percentage of MeOH in the higher density region is actually less than that expected for a 5 mol% solution. This
may in part be due to the oversimplification inherent in using Equation 14 to predict the MeOH solvation.
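
If the two fractions in Equation 14 are additionally assumed to sum to one (an assumption of this sketch, not stated explicitly in the text), the local MeOH fraction follows directly from the observed correlation time. The function name is hypothetical; the two limiting correlation times are the values quoted above.

```python
PHI_MEOH_PS = 150.0  # BTBP in MeOH, from the text
PHI_CO2_PS = 34.0    # BTBP in neat CO2, from the text

def methanol_fraction(phi_obs_ps):
    """Local MeOH fraction from phi_obs = phi_MeOH*f + phi_CO2*(1 - f)."""
    return (phi_obs_ps - PHI_CO2_PS) / (PHI_MEOH_PS - PHI_CO2_PS)

# A hypothetical observed correlation time of 68.8 ps maps to about 30% MeOH.
print(round(methanol_fraction(68.8), 3))
```

The linear mixing rule maps any φobs below 39.8 ps to less than the 5% bulk value, so an apparent depletion at high density can arise from the simplicity of the model itself.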

The degree of “solvent” clustering around the BTBP probe can be roughly estimated by dividing the recovered
experimental rotational correlation time (φexperimental) by the rotational correlation time predicted by the DSE equation
(φDSE). Figure 25 presents φexperimental/φDSE as a function of bulk density for the cosolvent mixture. These average
values show that in the low density region, we are observing an increase in the BTBP rotational reorientation time on
the order of 5 times the expected value. This increase in the rotational correlation time is consistent with local density
augmentation of the solvent surrounding the solute. In turn, this is consistent with other systems where clustering
decreases with increasing bulk density of the mixture.

Conclusions 

For the MeOH-modified CO2 system we see a significant decrease in the fluorescence lifetime of BTBP with
increasing bulk mixture density. This phenomenon can be readily explained by use of the Strickler-Berg analysis,
showing that the change in BTBP fluorescence lifetime is due to changing solvent refractive index. We observe that the
rotational correlation time for BTBP in MeOH-modified CO2 deviates significantly from the Debye-Stokes-Einstein
prediction and the recovered rotational correlation time in neat CO2. These deviations can be used to predict the local
composition surrounding the BTBP in the MeOH-modified CO2 system. Preliminary results show that preferential
MeOH solvation exists surrounding BTBP in MeOH-modified CO2 in the low bulk density region and that the local
density surrounding the probe can be estimated at approximately 5 times the bulk density in the low density region.

Summary 

Steady-state and time-resolved fluorescence spectroscopy were used to quantify solvation phenomena
occurring in several neat and modified supercritical fluid systems. Measurement of pyrene $I_1/I_3$ ratios allowed us to
experimentally determine solute-fluid interactions occurring in supercritical water. These results showed that near one
half the fluid critical density, the local density augmentation is on the order of five times greater than the bulk water
density. Pyrene was further used to quantify intermolecular interactions in supercritical n-alkanes (fuel precursors). An
increase in local alkane density surrounding pyrene dissolved in supercritical C5, C6, and C7 n-alkanes was observed at
one half the critical density with a degree of augmentation on the order of 3 to 4 times the bulk alkane density. These
results have shown that further experimentation is needed for a complete picture of how solute-fluid interactions in
aviation fuel precursors may affect fuel performance. We have used rotational reorientation measurements of a large
neutral solute, BTBP, to track bulk polymer dynamics within several PDMS polymers. We have shown that the addition
of ambient temperature CO2 to PDMS causes a dramatic decrease in the BTBP rotational correlation time near the CO2
gas-liquid phase boundary (25 °C). In addition, we showed that addition of supercritical CO2 allowed us to tune the bulk
polymer density and the BTBP rotational dynamics. We have also used BTBP rotational reorientation measurements to
quantify preferential solvation in MeOH-modified CO2. We observed that the rotational correlation time for BTBP in
MeOH-modified CO2 deviated significantly from the Debye-Stokes-Einstein prediction and the recovered BTBP
rotational correlation time in neat CO2. These deviations were used to predict the local composition surrounding the
BTBP in the MeOH-modified CO2 system. Preliminary results have shown that preferential MeOH solvation exists
surrounding BTBP in MeOH-modified CO2 in the low bulk density region. From these results, the local composition
surrounding the probe can be estimated to be approximately 5 times the bulk density in the low density region.

Figure 1. Simplified schematic of the high-pressure titanium cell used in this work. Abbreviations: FO,
polyimide-coated fiber optic; TF, titanium flange; GW, gold washers; SW, sapphire windows; TB,
titanium cell body; and HPF, HiP high-pressure fittings.

Figure 2. Simplified schematic of the apparatus for performing steady-state fluorescence experiments in SCW.
Abbreviations: VP, vacuum pump; FPT, freeze-pump-thaw cell; P, high-pressure syringe pump; PT,
pressure transducer; GC, GC oven; HCL, He-Cd laser; L, lens; T, XYZ translator; FO, fiber optic;
TC, titanium high-pressure cell; FR, flow restrictor; PC, preheater coil; TH, thermocouple; M,
monochromator; D, photomultiplier tube detector; and PC, personal computer.

Figure 3. Simplified schematic of the apparatus for performing time-resolved fluorescence experiments in
SCW. Abbreviations: NL, nitrogen laser; BS, beam splitter; I, iris; L, lens; T, XYZ translator; TC,
titanium high-pressure cell; BPF, bandpass filter; PD, photodiode; PMT, photomultiplier tube; OSC,
digital sampling oscilloscope; PC, personal computer.

Figure 4. Normalized emission spectra for pyrene in water at 26.6, 281.6 °C and 2450 psia, and 379.7 °C and
3040 psia.

Figure 5. Pyrene $I_1/I_3$ ratios as a function of reduced density for pyrene in water at 26.6, 77.4, 127.4, 203.8,
281.6, 379.7, 384.4 and 398.8 °C.

Figure 6. Pyrene $I_1/I_3$ ratio as a function of the dielectric cross term, $f(\epsilon, n^2)$, for the T = 379.7 °C data. The
theoretical line is based on gas and high-density liquid water values. The dashed “T^” line
is the theoretical line which has been compensated for the known decrease in $I_1/I_3$ with temperature.

Figure 7. Recovered local density augmentation ($\rho_{\mathrm{local}}/\rho_{\mathrm{bulk}}$) as a function of reduced density for pyrene in SCW
at T = 379.9, 384.4, and 398.8 °C.

Figure 8. Excited-state fluorescence intensity decay traces for pyrene in water at T = 29.0, 125.7, 272.2, and
379.5 °C.

Figure 9. Density/temperature-dependent pyrene fluorescence lifetimes in water at T = 29.0 (●), 76.6 (■),
125.7 (▲), 272.2 (▼), and 379.5 (♦) °C. (Inset) An Arrhenius plot of the lifetime data.

Figure 10. Pyrene $I_1/I_3$ ratios as a function of reduced density at $T_r$ = 1.01 in n-pentane (Panel A); n-hexane
(Panel B); and n-heptane (Panel C).

Figure 11. Pyrene $I_1/I_3$ ratios as a function of the dielectric cross term, $f(\epsilon, n^2)$, at $T_r$ = 1.01 in n-pentane (Panel
A); n-hexane (Panel B); and n-heptane (Panel C).

Figure 12. Recovered local density augmentation as a function of reduced density at $T_r$ = 1.01 for
pyrene in n-pentane (Panel A); n-hexane (Panel B); and n-heptane (Panel C).

Figure 13. Recovered rotational correlation time as a function of time for BTBP in PDMS (MW = 28000 g/mol).

Figure 14. Differential phase angle (Δ) as a function of frequency for BTBP in PDMS (MW = 1250, 3780, and
28000 g/mol) (Panel A); Polarized modulation ratio (Λ) as a function of frequency for BTBP in
PDMS (MW = 1250, 3780, and 28000 g/mol) (Panel B).

Figure 15. Recovered BTBP rotational correlation times in PDMS as a function of increasing molecular weight.
M, denotes the entanglement values reported for PDMS in the literature.

Figure 16. Natural log of the recovered BTBP rotational correlation time as a function of PDMS density. The
solid line represents the first order fit to the data.

Figure 17. Recovered BTBP rotational correlation times in PDMS (MW = 1250, 9430, 13650, and 28000
g/mol) as a function of CO2 pressure at 25 °C.

Figure 18. Calculated densities of CO2-dilated PDMS (MW = 1250, 9430, 13650, and 28000 g/mol) as a
function of CO2 pressure at 25 °C.

Figure 19. Recovered BTBP rotational correlation times in PDMS (MW = 9430 g/mol) as a function of CO2
pressure at ambient (25 °C) and supercritical (T = 36.5 °C) conditions (Panel A); Expanded view of
the T = 36.5 °C data (Panel B).

Figure 20. Steady-state emission and excitation spectra for BTBP in MeOH-modified CO2 at 46.0 °C and 2667
psia.

Figure 21. BTBP fluorescence lifetime as a function of cosolvent mixture bulk density at 46.0 °C.

Figure 22. Strickler-Berg plot for BTBP in MeOH-modified CO2 at 46.0 °C.

Figure 23. Recovered BTBP rotational correlation times in MeOH-modified CO2 at 46.0 °C. The dashed line
indicates the rotational correlation times predicted by the Debye-Stokes-Einstein equation. The
recovered rotational correlation time in neat CO2 at the experimental temperature is marked for
reference.

Figure 24. Percentage of MeOH surrounding BTBP calculated using Equation 15 as a function of bulk solvent
mixture density.

Figure 25. Recovered local density augmentation ($\phi_{\mathrm{experimental}}/\phi_{\mathrm{DSE}}$) for BTBP in MeOH-modified CO2 as a
function of bulk cosolvent mixture density.
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IMPACT  INITIATION  OF  TRITONAL  AND  PBX  N109 


Keith  M.  Roessig 
Graduate  Student 

Department  of  Aerospace  &  Mechanical  Engineering 
University  of  Notre  Dame 

Abstract 

Recently,  the  behavior  of  explosive  materials  under  intermediate  loading  rates  (10°-10^  s“^)  has  become 
of interest. Shock and quasi-static behaviors have been well characterized, but in weapons such as a deep
earth penetrator, loading conditions in intermediate strain rate ranges will occur prior to initiation and affect
the response of the explosive. This paper will examine the mechanical behavior of the melt cast simulant,
Filler-E, and a cure cast simulant under three widely different strain rates. These tests will include
quasi-static loading rates on an MTS machine, low velocity tests on a mechanical press, and high velocity tests
incorporating a Kolsky bar apparatus at the University of Notre Dame. Impact tests of the explosives
Tritonal and PBX-N109 conducted at the Advanced Warhead Experimentation Facility at Eglin AFB, FL
will be discussed.

Filler-E and Tritonal are shown to be very brittle materials, while the cure cast simulant and PBX N109
are much more ductile. All of these materials have strengths well below that of the steel used in bomb
casings (usually 4340) and can be neglected in comparison. Slight strain rate hardening is seen in both simulants,
though for Filler-E, failure strains drop tremendously as failure becomes a dynamic fracture event. The
impact tests on Tritonal and PBX-N109 show that these materials need to have high hydrostatic loads after
failure to generate the internal friction needed for initiation.

IMPACT INITIATION OF TRITONAL AND PBX N109

Keith M. Roessig


1  Introduction 

Explosives are used in a wide variety of applications, from mining to military weaponry to metal forming.
Understanding their behavior is very important in all applications to allow for their efficient and safe use.
Safety is especially critical in the handling and storage of explosives, and preventing sympathetic, or
unplanned, detonation is one of the primary goals of the explosive engineer. In addition, design applications
also require a thorough knowledge of the mechanical and thermal behaviors of these materials.

The  theory  of  detonation  of  explosives  is  a  very  complicated  one,  and,  due  to  the  small  time  scales 
involved,  has  not  been  studied  extensively  by  many  academic  institutions.  A  key  element  of  detonation 
theory  is  the  coupling  between  mechanics  and  chemistry  during  the  reaction.  This  coupling  cannot  be 
ignored  and  should  be  included  in  any  realistic  model  [6].  Frequently,  the  ignition  is  assumed  to  be  heavily 
dependent  on  the  formation  of  hot  spots,  localized  regions  of  intense  heat  generation.  Factors  that  can  lead 
to  hot  spot  formation  and  ignition  include  jetting,  void  collapse,  viscous  heating,  shock  interaction,  internal 
friction  and  adiabatic  shear  localization  [2,  6].  Adiabatic  shear  localization  occurs  at  high  strain  rates  when 
shear  deformation  may  cause  thermal  softening  through  the  plastic  work  done  on  the  material.  The  softer 
material deforms more, causing further heating. This self-feeding process can become localized into a very
small  region,  and  the  local  temperatures  can  become  very  high.  Shear  banding  has  been  relatively  unexplored 
experimentally  as  a  source  of  ignition  in  solid  explosives.  Field  et  al.  [10]  took  high  speed  photographs  of 
explosives  initiating  from  hot  spot  formation  produced  by  the  variety  of  mechanisms  listed  above.  Evidence 
of  shear  localization  was  found  in  some  tests.  Boyle  et  al.  [3]  investigated  ignition  of  certain  explosives 
under  combined  pressure  and  shear  conditions  and  found  that  shear  bands  do  form  in  the  interior  of  the 
explosive  when  hydrostatic  pressure  is  applied.  Chou  [5]  has  run  numerical  simulations  of  the  impact  of 
various  explosives  with  steel  projectiles  and  found  shear  bands  to  form.  Temperatures  within  these  bands, 
however,  were  not  always  great  enough  to  cause  initiation.  Understanding  how  these  mechanisms  interact 
under  high  strain  rates  is  essential  to  the  proper  design  and  safe  use  of  reactive  materials. 

Recently, greater emphasis has been placed on determining the mechanical and reactive properties of
explosives deforming under lower strain rates. Initiation under very high strain rate loadings such as shock
waves occurs due to adiabatic, compressive heating, while initiation under very small, or quasi-static, loading
rates occurs mainly from external heating. Loading cases more typically seen by these materials during use
result in strain rates between these extremes. One example is the deep earth penetrator. The explosives
inside the weapon will undergo various loading rates and loading geometries before reaction occurs. How the
explosives behave during the loading and at detonation by the fuse is critical to successful implementation
of these weapons.

The  explosives  Tritonal  and  PBX-N109  are  examined  in  this  paper.  These  are  respectively  a  melt  cast 
trinitrotoluene  (TNT)  based  explosive  and  a  cure  cast  plastic  bonded  explosive  (PBX)  containing  the  Royal 
Development  Explosive  (RDX)  trinitro  triazacyclohexane.  No  pressed  PBX  explosives  were  used.  In  the 
interest  of  safety,  the  inert  melt  cast  simulant,  Filler-E,  and  inert  cure  cast  PBX  simulants  were  used  in  the 
punch tests at the University of Notre Dame. It is important to study these materials for several reasons: 1)
to determine the behavior of these materials, 2) to determine how closely they imitate the mechanical behavior
of the explosives they simulate, 3) to gain knowledge of the mechanical behavior without reaction for safer
experiments, and 4) to record data to be used in finite element codes such as EPIC or ABAQUS to model other
experiments where they are used.

Though  shear  localization  can  be  described  by  a  material  process,  geometry  plays  an  important  role  in 
the  event.  Failure  can  be  constrained  to  a  shear  dominated  mode  by  the  experimental  setup.  Punching 
and  plugging  are  two  situations  in  which  adiabatic  shear  localization  can  occur.  Plugging  is  the  process 
in  terminal  ballistics  in  which  failure  occurs  in  a  shear  dominated  mode,  though  bending  deformation  can 
be  quite  large.  The  clearance  between  the  projectile  and  any  support  is  very  large,  so  in  effect  it  is  an 
infinite  plate.  Usually  plugging  occurs  at  large  impact  velocities,  depending  upon  the  material  of  the  target 
and  projectile.  Punching,  on  the  other  hand,  requires  a  very  small  clearance  between  punch  and  die,  and 
usually  occurs  in  manufacturing.  Punching  velocities  are  small  and  the  failure  is  again  largely  dominated 
by shear since the die prevents much bending deformation. Figure 1 shows the geometric difference between
punching and plugging. The punch test was chosen here because higher strain rates can be obtained at
lower punch velocities with the shear deformation constrained to a smaller area.

Figure 1: Geometries of punching and plugging operations

Load-displacement diagrams are very important in the analysis of punch tests on materials. The various
sections of the load-displacement diagram indicate what may be occurring during each phase of the failure
process (see Figure 2). The energy used in the process, the area under the curve, can be correlated to the
constitutive characteristics of the material. Yield and ultimate strengths as well as strain hardening values
can be estimated from these curves. Most of the area under the graph is energy being used to deform
the material, both elastically and plastically. Once the material begins to fracture, however, most of the
required punching energy comes from friction. Figure 2 is a schematic, and actual load-displacement curves will
vary depending on punch velocity and material. Average stress-average strain curves can be determined
from the load-displacement data.
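
The statement that the process energy is the area under the load-displacement curve can be illustrated with a short numerical sketch. The function name and the sample data below are hypothetical and serve only to show the trapezoidal integration; they are not measurements from this work.

```python
def punch_energy(displacement_m, load_n):
    """Area under a load-displacement curve by the trapezoidal rule (joules)."""
    if len(displacement_m) != len(load_n):
        raise ValueError("displacement and load must have the same length")
    energy = 0.0
    for k in range(1, len(displacement_m)):
        step = displacement_m[k] - displacement_m[k - 1]
        # Mean load over the step times the displacement increment.
        energy += 0.5 * (load_n[k] + load_n[k - 1]) * step
    return energy


# Hypothetical triangular load pulse, for illustration only.
displacement = [0.0, 1.0e-3, 2.0e-3]  # m
load = [0.0, 1000.0, 0.0]             # N
print(punch_energy(displacement, load), "J")
```

Splitting the integration at the displacement where fracture begins would separate the deformation energy from the friction-dominated part described above.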


Figure  2:  Load-displacement  curve  schematic  (Bai  and  Johnson,  1982) 


2  Experimental  Method 

2.1  Tests  on  Simulants 

Three punch tests were conducted at different velocities of $2.0\times10^{-5}$ m/s, 1 m/s, and ~10 m/s using the
insert/die  configuration  shown  in  Figure  3.  The  insert  slides  into  the  die,  and  the  simulant  is  placed  in  a 
recess  machined  for  the  specimen.  The  projectile  slides  through  the  hole  in  the  cover  plate,  which  is  bolted 
to  the  die,  and  can  easily  pass  through  the  insert  and  die. 

The  melt  cast  simulant,  Filler-E,  and  cure  cast  PBX  simulant  were  used  in  this  study.  All  the  specimens 
were  discs  of  the  same  size,  0.25”  thick  and  2”  in  diameter.  These  materials  were  obtained  from  the  High 
Explosive  Research  and  Development  (HERD)  division  of  the  Wright  Labs  Armament  Directorate  at  Eglin 
AFB.  The  clearance  between  the  punch  and  die  in  all  the  tests  was  2.54mm.  Three  separate  testing  methods 
were  used  to  determine  load-displacement  curves  at  a  wide  range  of  strain  rates. 


Figure  3:  Insert/die  configuration  for  MTS  and  mechanical  press 

Quasi-static tests were conducted on an MTS 810 20 kip tension/compression machine at an average shear
strain rate of $5.0\times10^{-3}$ s$^{-1}$. The load was read directly from the load cell on the MTS, while the displacement was
obtained through an Epsilon Technology Corp. 3540-1000-ST deflectometer. This instrument is similar to an
extensometer, but uses strain gages to measure the deflection of a small shaft. Using the calibration from the
manufacturer, voltage traces can be recorded on a digital oscilloscope and then converted to displacements.
Though the MTS machine does have its own position indicator, it was found that there was too much error
from the machine compression of the projectile and die to measure such small displacements. Voltages were
recorded on digital oscilloscopes and then transferred to a PC for analysis.

The low velocity tests were conducted on a mechanical press machine shown in Figure 4. The same die
configuration was used on the mechanical press as on the MTS machine. The punch velocity was approximately
1 m/s, giving an average shear strain rate of ~200 s$^{-1}$. Strain gages were placed on the punch to determine the load upon
the specimen during the test. Output traces were recorded on a digital oscilloscope. The displacement was
recorded with the same deflectometer used with the MTS machine.
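
The quoted strain rate can be checked with a rough estimate. The sketch below assumes the approximation used in the elastodynamic analysis of this report, in which the shear strain is about half the punch displacement divided by a width w, and it further assumes that w is the 2.54 mm punch-die clearance. Under those assumptions the average shear strain rate is the punch velocity divided by 2w. The function name is hypothetical.

```python
def average_shear_strain_rate(punch_velocity, clearance):
    """Estimate the average shear strain rate (1/s) as V / (2 w).

    Assumes shear strain ~ u_z / (2 w) with w taken as the punch-die clearance.
    """
    return punch_velocity / (2.0 * clearance)


CLEARANCE = 2.54e-3  # m, punch-die clearance stated for the simulant tests

# Mechanical press test at roughly 1 m/s.
rate = average_shear_strain_rate(1.0, CLEARANCE)
print(f"{rate:.0f} 1/s")
```

For a 1 m/s punch this estimate comes out near 200 s$^{-1}$, in line with the value quoted for the mechanical press.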

For the high speed tests, a punch-loading Kolsky bar apparatus was built.
This apparatus is described in volume 8 of the ASM Handbook [1] and has been successfully used by
Dowling et al. [8] and Zurek [13]. The basic design is shown in Figure 5.

Figure 4: Mechanical press used for low velocity tests

Using an air gun, a long projectile is fired at the Kolsky bar. This sends a stress pulse down the length
of the bar that can be measured using strain gages placed on the bar as shown. As the pulse reaches the
end  of  the  bar,  the  bar  will  impact  the  specimen.  The  stress  pulse  will  be  partially  reflected  and  partially 
transmitted.  The  reflected  portion  of  the  pulse  is  measured  by  the  strain  gages  that  also  measured  the  initial 
stress  pulse.  While  the  transmitted  portion  could  be  measured  using  strain  gages  on  a  die  tube  behind  the 
specimen,  it  is  not  needed  in  this  case;  due  to  the  large  diameter  of  the  die  as  compared  to  the  bar,  less  than 
1%  of  the  pulse  is  transmitted  to  the  die.  It  is  consequently  assumed  that  there  is  no  transmitted  wave,  thus 
simplifying  the  analysis  of  the  results  (section  2.1.1).  Average  strain  rates  of  up  to  1.0x10“*  can  be  obtained 
with  the  apparatus  at  Notre  Dame. 


Figure 5: Punch-loading Kolsky bar apparatus

2.1.1 Elastodynamic Analysis

In the Kolsky bar experiment, the following analysis [13] may be used to get load-displacement and average
stress-average strain curves. During use of a Kolsky bar (Figure 6), stress pulses are sent down the length of
the bar to cause a specimen to fail. From elastodynamics, the stress in an elastic wave can be related to the
particle velocity by the relation

$$\sigma = E\,\frac{\partial u}{\partial x} = \frac{E}{c}\,\frac{\partial u}{\partial t} = \rho c\,\frac{\partial u}{\partial t}$$

where $E$ is the elastic modulus, $c$ is the elastic wave speed, $\sigma$ is the axial stress, and the term $\partial u/\partial t$ describes
the particle velocity, $V$, in the bar. This velocity will have a different sign depending on the direction of
travel (sign of $c$) of the pulse itself and the sign of the stress, and there are two relations describing the stress
in relation to the particle velocity.

$$\sigma = \begin{cases} \rho c V, & \text{left traveling wave} \\ -\rho c V, & \text{right traveling wave} \end{cases}$$


Figure  6:  Punch-loading  Kolsky  bar  apparatus 

For the incident pulse on the Hopkinson bar, the right traveling wave relations are used, so the incident
stress is

$$\sigma_i = -\rho c V_i.$$

Because the bar remains elastic, the strain can be given by

$$\epsilon_i = \frac{\sigma_i}{E}$$

and then substituted into the stress equation to yield

$$V_i = -\frac{E\,\epsilon_i}{\rho c} = -c\,\epsilon_i.$$

The relations for the reflected wave are similar, but these are left traveling waves, so the elastic stress is
defined as

$$\sigma_r = \rho c V_r$$

This leads to a final expression for $V_r$ of

$$V_r = \frac{E\,\epsilon_r}{\rho c} = c\,\epsilon_r.$$

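The shear strain in the specimen, $\gamma_{rz}$, is defined as

$$\gamma_{rz} = \frac{1}{2}\left(\frac{\partial u_r}{\partial z} + \frac{\partial u_z}{\partial r}\right)$$

where $u_r$ is the displacement in the $r$ direction and $u_z$ is the displacement in the $z$ direction. In the punch
test, $\partial u_r/\partial z = 0$, and $\partial u_z/\partial r \approx u_z/w$. Therefore, the shear strain can be approximated by

$$\gamma_{rz} \approx \frac{1}{2}\,\frac{u_z}{w}$$

The displacement at the end of the bar is defined as

$$u_z = \int (V_i + V_r)\,dt$$

Substituting in for $V_i$ and $V_r$,

$$u_z = \int -c\,(\epsilon_i - \epsilon_r)\,dt = -c\int (\epsilon_i - \epsilon_r)\,dt$$

So now shear strain and strain rate become

$$\gamma_{rz} = -\frac{c}{2w}\int (\epsilon_i - \epsilon_r)\,dt$$

$$\dot{\gamma}_{rz} = -\frac{c}{2w}\,(\epsilon_i - \epsilon_r)$$

respectively.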
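
The displacement, shear strain, and shear strain rate relations above can be applied to sampled strain gage records as in the following sketch. It assumes that the incident and reflected strain histories have already been separated and time-aligned, that compressive strain is negative, and that w is the punch-die clearance. The function name, the wave speed, and the sample values are hypothetical.

```python
def reduce_punch_signals(time_s, eps_incident, eps_reflected, wave_speed, width):
    """Return (u_z, gamma, gamma_rate) lists from bar strain records.

    u_z        = -c * integral(eps_i - eps_r) dt   (trapezoidal rule)
    gamma      = u_z / (2 w)
    gamma_rate = -c * (eps_i - eps_r) / (2 w)
    """
    n = len(time_s)
    diff = [eps_incident[k] - eps_reflected[k] for k in range(n)]
    u_z = [0.0] * n
    for k in range(1, n):
        dt = time_s[k] - time_s[k - 1]
        u_z[k] = u_z[k - 1] - wave_speed * 0.5 * (diff[k] + diff[k - 1]) * dt
    gamma = [u / (2.0 * width) for u in u_z]
    gamma_rate = [-wave_speed * d / (2.0 * width) for d in diff]
    return u_z, gamma, gamma_rate


# Hypothetical constant-amplitude pulses, for illustration only.
t = [0.0, 1.0e-6, 2.0e-6]   # s
eps_i = [-1.0e-3] * 3       # compressive incident strain
eps_r = [5.0e-4] * 3        # tensile reflected strain
u, g, g_dot = reduce_punch_signals(t, eps_i, eps_r, 5000.0, 2.54e-3)
print(u[-1], g[-1], g_dot[-1])
```

With the sign convention used here a compressive incident pulse gives a positive punch displacement, which matches the leading minus sign in the displacement integral.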


The shear stress, $\tau$, is defined as the shear force divided by the shear area where


■^shear  —  ^^6  +  2x  71  X  h 

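$$F_{\mathrm{shear}} = -A_b E\,(\epsilon_i + \epsilon_r) = -\frac{\pi d_b^2}{4}\,E\,(\epsilon_i + \epsilon_r)$$

Combining the above equations yields

$$\tau_{rz} = \frac{F_{\mathrm{shear}}}{A_{\mathrm{shear}}} = -\frac{A_b E}{A_{\mathrm{shear}}}\,(\epsilon_i + \epsilon_r)$$

Equations 2, 3, 4 are used to determine $\tau_{rz}$, $\gamma_{rz}$, and $\dot{\gamma}_{rz}$ for each test.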
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2.2  Explosives  at  AWEF 


Experiments on actual explosives were conducted at the Advanced Warhead Experimentation Facility (AWEF)
at Eglin AFB, Florida. This facility is in the Armament Branch of Wright Labs and is equipped to
perform such tests safely. Using a 12.7 mm (0.50” caliber) powder gun, cylindrical steel
projectiles 76.4 mm in length were shot at the specimens at velocities around 300 m/s. Pressure transducers
with a known separation length at the end of the barrel produced voltage traces recorded on an oscilloscope,
allowing a velocity to be calculated.
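
The velocity calculation from the two transducer traces can be sketched as a time-of-flight estimate. The code below is only an illustration: the threshold-crossing approach, the function names, the separation distance, and the synthetic traces are all assumptions rather than details of the AWEF instrumentation.

```python
def first_crossing_time(time_s, volts, threshold):
    """Linearly interpolated time at which a trace first rises through threshold."""
    for k in range(1, len(time_s)):
        if volts[k - 1] < threshold <= volts[k]:
            frac = (threshold - volts[k - 1]) / (volts[k] - volts[k - 1])
            return time_s[k - 1] + frac * (time_s[k] - time_s[k - 1])
    raise ValueError("trace never crosses the threshold")


def projectile_velocity(time_s, trace_a, trace_b, separation_m, threshold):
    """Velocity = transducer separation / difference in arrival times."""
    t_a = first_crossing_time(time_s, trace_a, threshold)
    t_b = first_crossing_time(time_s, trace_b, threshold)
    return separation_m / (t_b - t_a)


# Synthetic step-like traces, for illustration only.
t = [0.0, 1.0e-4, 2.0e-4, 3.0e-4, 4.0e-4]  # s
first = [0.0, 1.0, 1.0, 1.0, 1.0]          # V, upstream transducer
second = [0.0, 0.0, 0.0, 1.0, 1.0]         # V, downstream transducer
print(projectile_velocity(t, first, second, 0.06, 0.5), "m/s")
```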

Both  Tritonal  and  PBX-N109  specimens  were  made  36mm  in  diameter  and  6.5mm  thick.  The  specimens 
were  placed  in  a  removable  insert  which  went  into  the  die,  see  Figure  7.  The  inserts  were  made  of  4340 
steel  hardened  to  Rockwell  C  of  approximately  45.  These  inserts  produced  a  clearance  of  2.54mm  between 
the  projectile  and  inner  diameter  of  the  insert.  Thin  aluminum  discs  were  placed  behind  the  specimens  to 
prevent  loss  of  material  due  to  spalling.  Cover  plates  made  of  titanium  6%  Al-4%  V  alloy  were  used  in  some 
shots  to  contain  the  explosive  and  determine  the  effect  of  a  cover  plate. 


Figure 7: Die/insert configuration

The end of the powder gun and the specimen/die setup were all placed inside a steel-framed Lexan tank.
The tank contained 2 feet of celotex behind the die to catch the projectile before it impacted the back plate of
the tank. A mirror was placed on one side of the die to allow two views of the event to be captured on each
frame of film. A Cordin 330 high speed camera was placed on the opposite side of the mirror (see Figure 8).
This allows for a side view and an angle view of the projectile impact in each frame. Framing rates were
approximately 250,000 frames/sec.

Figure 8: Experimental setup for test shots

The use of high speed photography can play two important roles in investigating the processes mentioned
above. First, the failure process can be examined. The amount of deformation and fracture, or a combination
of the two, recorded on the film allows the determination of the failure mode under different loading cases.
Secondly, the initiation of the materials can be captured by eliminating any external light sources. The
light generated by the reaction itself can be used to expose the film. This allows the amount of reaction,
if any occurs, to be recorded. By examining the events captured on film, insight can be gained on how the
mechanical and chemical processes within the reactive material interact.

2.3  Hugoniot  Analysis 

As stated in the introduction, one method of initiation of explosives is by shock waves. Many experiments
have been performed to characterize the shock response of explosives under different conditions. This
experiment examines the possibility of initiation through adiabatic shear localization or internal friction.
To eliminate shock detonation as a possibility, the shock must be characterized for different impact velocities.

A one dimensional shock can be modeled as an instantaneous jump in a material from one state
to another. This disturbance travels along at a speed $U_s$ (see Figure 9). The shock wave changes the initial
conditions, $P_0$, $\rho_0$, $T_0$, to the new conditions, $P$, $\rho$, and $T$, where $P$ is the pressure, $\rho$ is the density, and
$T$ is the temperature. The particle velocity also jumps from 0 to a certain value, $U_p$. By changing to a
Lagrangian reference frame moving with the shock so the shock front appears stationary, the particles seem
to approach the shock at a speed of $U_s$ and leave with a velocity of $(U_s - U_p)$. This representation will be
used in the impact problem described later.

Figure 9: Schematic of a one dimensional shock from a) a stationary reference frame and b) a moving
reference frame


The analysis begins with the conservation equations at the shock front.

Conservation of Mass:

$$\rho_0 U_s = \rho\,(U_s - U_p) \tag{5}$$

Conservation of Momentum:

$$(P - P_0) = \rho_0 U_s U_p \tag{6}$$

Conservation of Energy:

$$\Delta W = (PA)(U_p\,dt) - (P_0 A)(U_0\,dt)$$

$$\Delta E = \tfrac{1}{2}\left[\rho A (U_s - U_p)\,dt\right] U_p^2 + E A \rho\,(U_s - U_p)\,dt - \tfrac{1}{2}\left[\rho_0 A (U_s - U_0)\,dt\right] U_0^2 - E_0 A \rho_0 (U_s - U_0)\,dt$$

where $A$ is the cross-sectional area and $U_0$ is the initial particle velocity. $W$ is the work done on a particle
as it passes through the shock front, and $E$ is the energy in a material particle. Equating $\Delta W$ to $\Delta E$ and
setting $U_0 = 0$ yields

$$P U_p = \tfrac{1}{2}\rho\,(U_s - U_p)\,U_p^2 - E_0 \rho_0 U_s + E \rho\,(U_s - U_p)$$

Substituting the mass equation, equation (5), yields

$$P U_p = \tfrac{1}{2}\rho_0 U_s U_p^2 + \rho_0 U_s (E - E_0) \tag{7}$$

By using the mass and momentum equations, (5) and (6), the final form of the energy equation is

$$E - E_0 = \tfrac{1}{2}(P + P_0)(v_0 - v) \tag{8}$$

where $v = 1/\rho$.
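
The step from equation (7) to equation (8) is stated without intermediate algebra. One way to fill it in, offered here as a sketch that uses only equations (5) to (7) and the definition $v = 1/\rho$, is:

```latex
\begin{align*}
E - E_0 &= \frac{P\,U_p}{\rho_0 U_s} - \tfrac{1}{2} U_p^2
  && \text{rearranging (7)} \\
U_p &= \rho_0 U_s\,(v_0 - v)
  && \text{from (5), since } \rho_0 v_0 = 1 \\
U_p^2 &= (\rho_0 U_s U_p)(v_0 - v) = (P - P_0)(v_0 - v)
  && \text{using (6)} \\
\frac{P\,U_p}{\rho_0 U_s} &= P\,(v_0 - v)
  && \text{using the second line} \\
E - E_0 &= P\,(v_0 - v) - \tfrac{1}{2}(P - P_0)(v_0 - v)
         = \tfrac{1}{2}(P + P_0)(v_0 - v)
\end{align*}
```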

The four unknowns are $U_s$, $U_p$, $P$, and $E$. With only 3 conservation equations, one more equation is
needed for a complete set. This equation is called the equation of state (EOS). Though it is not the same
for a solid as an equation of state for a gas, it performs a similar role in the solution, so it has been given
the same name.¹ This equation relates the shock and particle velocities through the following relation.

Equation of State (EOS):

$$U_s = C + S U_p \tag{9}$$

where $C$ and $S$ are material constants. These constants have been measured and tabulated in many references.
By combining equations (6) and (9), a relation between the shock pressure and the particle speed can be
obtained.

$$P = P_0 + \rho_0 (C + S U_p)\,U_p \tag{10}$$
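
The jump relations can be chained numerically once a particle velocity is chosen. The sketch below evaluates equations (9), (6), (5), and (8) in that order. The function name and the material constants are hypothetical placeholders chosen only to exercise the formulas and are not values from this report.

```python
def shock_state(rho0, c_const, s_const, particle_velocity, p0=0.0):
    """Return (U_s, P, rho, E - E_0) behind a shock for a given U_p.

    Uses the linear U_s-U_p equation of state (9), momentum (6),
    mass (5), and the energy jump (8), in SI units.
    """
    shock_velocity = c_const + s_const * particle_velocity                 # (9)
    pressure = p0 + rho0 * shock_velocity * particle_velocity              # (6)
    density = rho0 * shock_velocity / (shock_velocity - particle_velocity) # (5)
    energy_jump = 0.5 * (pressure + p0) * (1.0 / rho0 - 1.0 / density)     # (8)
    return shock_velocity, pressure, density, energy_jump


# Placeholder constants for illustration only.
us, p, rho, de = shock_state(rho0=1600.0, c_const=2400.0, s_const=2.0,
                             particle_velocity=250.0)
print(us, p, rho, de)
```

With $P_0 = 0$ the energy jump reduces to half the square of the particle velocity, which gives a quick consistency check on any implementation.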

Equation (10) is very important in impact problems. The method of impedance matching can be used to
find the shock pressure in the impact between two dissimilar materials [12]. During impact, the pressures on
the two sides of the interface must be equal. This gives rise to different shock and particle velocities by the equations above.
The particles in the target travel at a velocity of $U_{p2}$ while the projectile particles reduce speed by $U_{p1}$ to a
final value of $V - U_{p1}$ (see Figure 10).


Figure 10: Schematic of impact problem: (a) $t<0$, (b) impact, $t=0$, (c) $t>0$

At the interface, the particle speeds must be the same, so

$$V - U_{p1} = U_{p2}$$

and

$$U_{p2} + U_{p1} = V$$

Conservation  of  momentum  and  the  equations  of  state  for  the  projectile  and  target  can  be  used  to  find  the 
final  values.  The  conservation  of  momentum  for  the  projectile  and  target  are 

$P_1 = \rho_{01} U_{s1} U_{p1}$

$P_2 = \rho_{02} U_{s2} U_{p2}$

respectively. P0 is assumed to be 0. This is a good assumption as pressures can routinely reach above 100
atmospheres in these analyses. The EOS for the two materials are

$U_{s1} = C_1 + S_1 U_{p1}$

$U_{s2} = C_2 + S_2 U_{p2}$

¹ The more conventional EOS for gases is $E = Pv/(\gamma - 1) - \lambda Q$, where λ and Q are reaction parameters.
This equation has been used for solids with a constant γ = 3 [9], which allows solution for detonation wave
speeds and reaction zone structures, but it is not suitable for the impact problem.

Combining  the  equations  yields 

$P_1 = \rho_{01}(C_1 + S_1 U_{p1}) U_{p1} = \rho_{01} C_1 U_{p1} + \rho_{01} S_1 U_{p1}^2$

$P_2 = \rho_{02}(C_2 + S_2 U_{p2}) U_{p2}$

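Using the substitution Up1 = V - Up2 and setting P1 = P2, the following quadratic equation for Up2 can be
derived.

$U_{p2}^2 (\rho_{02} S_2 - \rho_{01} S_1) + U_{p2} (\rho_{02} C_2 + \rho_{01} C_1 + 2\rho_{01} S_1 V) - \rho_{01} (C_1 V + S_1 V^2) = 0$

The roots of the equation are

$U_{p2} = \dfrac{-(\rho_{02} C_2 + \rho_{01} C_1 + 2\rho_{01} S_1 V) \pm \sqrt{\Delta}}{2(\rho_{02} S_2 - \rho_{01} S_1)}$

where

$\Delta = (\rho_{02} C_2 + \rho_{01} C_1 + 2\rho_{01} S_1 V)^2 - 4(\rho_{02} S_2 - \rho_{01} S_1)(-\rho_{01})(C_1 V + S_1 V^2)$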

Two solutions for Up2 will be found due to the quadratic nature of the equation. The solution to use is
the one for which Up2 is less than the original projectile velocity, V. With Up2 known, all other quantities,
including the shock pressure P2, can then be calculated with the relations already discussed.
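
The quadratic solution described above can be sketched in a few lines. This is a minimal illustration, not
the code used in the study: the function name is hypothetical, P0 is taken as zero, and the steel and TNT
constants are assumed round values chosen only to exercise the formula. The physical root is selected by
requiring 0 < Up2 < V.

```python
import math


def impedance_match(rho1, c1, s1, rho2, c2, s2, v):
    """Solve the quadratic for the target particle velocity Up2.

    Subscript 1 is the projectile and 2 is the target; P0 is taken as zero.
    Returns (up2, pressure) for the root that satisfies 0 < Up2 < V.
    """
    a = rho2 * s2 - rho1 * s1
    b = rho2 * c2 + rho1 * c1 + 2.0 * rho1 * s1 * v
    c = -rho1 * (c1 * v + s1 * v * v)
    if abs(a) < 1e-12:
        # Equal rho*S products: the quadratic term vanishes and the equation is linear.
        roots = [-c / b]
    else:
        disc = b * b - 4.0 * a * c
        roots = [(-b + sign * math.sqrt(disc)) / (2.0 * a) for sign in (1.0, -1.0)]
    up2 = next(r for r in roots if 0.0 < r < v)
    pressure = rho2 * (c2 + s2 * up2) * up2   # target momentum equation with its EOS
    return up2, pressure


# Assumed, illustrative constants (density in kg/m^3, C in m/s); not from the report.
STEEL = dict(rho1=7850.0, c1=3574.0, s1=1.92)
TNT = dict(rho2=1630.0, c2=2390.0, s2=2.05)

up2, p = impedance_match(**STEEL, **TNT, v=1650.0)
print(f"Up2 = {up2:.0f} m/s, Up1 = {1650.0 - up2:.0f} m/s, P = {p / 1e9:.1f} GPa")
```

With these assumed constants the interface pressure comes out on the order of 11 GPa; different tabulated
constants will shift the result.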

A graphical solution of the same problem is also possible. By plotting pressure versus particle velocity
for the projectile and target, the solution can be read directly. Plot the target curve normally, then plot
the projectile curve with its origin at V and invert it (change Up to -Up). The point where the two curves
cross is the solution to the impact problem. Figure 11 shows the P-Up plane for a steel on TNT impact.
TNT was used because its behavior is close to that of Tritonal, and the data was readily available. The
impact velocity of 1650 m/s was used to give a shock pressure P of 10.7 GPa. The particle speed in the TNT
is 1325 m/s with a shock speed of 4660 m/s. The same parameters for steel are 325 m/s and 4200 m/s,
respectively.

Figure 11: Hugoniot analysis of steel on TNT impact

The impact speed of 1650 m/s was chosen above to give the final shock pressure of 10.7 GPa. This
pressure is the initiation pressure for TNT [7] for this geometry. This speed is much higher than the 300
m/s velocity used in the experiment. Because the velocities are so much lower, shock initiation can be ruled
out as a possible mechanism for reaction during the tests.
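
The graphical construction can also be carried out numerically, which makes the comparison between the
two impact speeds concrete. The sketch below locates the crossing of the target curve and the mirrored
projectile curve by bisection. It is a minimal illustration under stated assumptions: the function names are
hypothetical, P0 is zero, and the steel and TNT constants are assumed values, not data from the report.

```python
def hugoniot_pressure(rho0, c, s, up):
    """Pressure on the Hugoniot, equation (10) with P0 = 0."""
    return rho0 * (c + s * up) * up


def crossing(projectile, target, v, tol=1e-6):
    """Bisect for the Up at which the target curve meets the mirrored projectile curve."""
    lo, hi = 0.0, v
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Target curve minus the projectile curve drawn from V and inverted (Up -> V - Up).
        gap = hugoniot_pressure(*target, mid) - hugoniot_pressure(*projectile, v - mid)
        if gap > 0.0:
            hi = mid
        else:
            lo = mid
    up = 0.5 * (lo + hi)
    return up, hugoniot_pressure(*target, up)


# (rho0 [kg/m^3], C [m/s], S): assumed, illustrative constants, not from the report.
STEEL = (7850.0, 3574.0, 1.92)
TNT = (1630.0, 2390.0, 2.05)

for v in (1650.0, 300.0):
    up, p = crossing(STEEL, TNT, v)
    print(f"V = {v:6.0f} m/s -> Up = {up:7.1f} m/s, P = {p / 1e9:5.2f} GPa")
```

With these assumed constants, the 300 m/s case produces roughly a tenth of the pressure obtained at
1650 m/s, which is consistent with the argument that shock initiation is not expected at the test velocity.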

3  Results  &  Discussion 

3.1  Punch  Tests  on  Explosive  Simulants 

The load-displacement curves for the simulants under quasi-static loading rates are shown in Figure 12.
For both the Filler-E and the cure cast simulant, one of the three tests showed slightly different behavior.
This reflects the variability of material properties within these materials. Repeatable behavior
may be hard to obtain due to the numerous possibilities for imperfections in the materials. Post mortem
examination revealed that the specimens did repeat failure behavior qualitatively. The Filler-E was very
brittle. A central plug formed, with radial cracks extending outward to the outer diameter. The cure cast
PBX simulant formed a plug every time and did not have any radial fractures. There were no fragments,
leading to the conclusion that the material tore. Average stress-strain curves for the two materials are
shown in the same graphs.

Figure 12: Quasi-static load-displacement and average stress-average strain curves for (a) Filler-E and (b)
cure cast PBX simulant

At low velocities on the mechanical press, there is not much change in the behavior of the materials. Both
Filler-E and the cure cast PBX simulant showed the same failure pattern as in the quasi-static tests. The
maximum loads in the materials were higher, showing strain rate hardening, which is very large in the case
of the cure cast PBX simulant. Failure strains for both the cure cast PBX simulant and Filler-E were reduced
by a factor of 10 from the quasi-static tests. In the case of the Filler-E, this is most likely due to fracture as
the compressive wave reflects off the back face and becomes tensile. Due to the brittle nature of Filler-E, the
specimen cracks very quickly under the dynamic loading. The PBX simulant seems to exhibit a large
strain rate dependence in both load capacity and failure strain. Load-displacement and average stress-strain
curves from the mechanical press are in Figure 13.

Figure 13: Low velocity load-displacement and average stress-average strain curves for (a) Filler-E and (b)
cure cast PBX simulant

Results from a Hopkinson bar test with no specimen allow verification of the method and give
insight into what results can be expected during an actual test. Figure 14 shows the raw data obtained from
the digital oscilloscope. The top trace shows both strain gage histories, and the second shows the average.
Averaging the two eliminates any bending that may occur. Bending should be kept to a minimum, and
the first trace shows that it is: the two strain gage curves lie almost directly on top of each other. If large
amounts of bending are present, the impact of the bar with the specimen will not be perpendicular and may
lead to erroneous results. The two strain gages have very similar voltage traces, verifying the absence of a
large bending stress. The impact speed in this test was approximately 11.5 m/s.
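
The averaging step can be illustrated with a short sketch. It assumes the two gages are mounted
diametrically opposite each other on the bar, so that bending appears with opposite sign on each gage
while the axial strain appears with the same sign on both. The function name and the sample values are
hypothetical.

```python
def split_axial_bending(gage_a, gage_b):
    """Separate axial and bending strain from two diametrically opposed gages."""
    axial = [(a + b) / 2.0 for a, b in zip(gage_a, gage_b)]     # bending cancels
    bending = [(a - b) / 2.0 for a, b in zip(gage_a, gage_b)]   # axial part cancels
    return axial, bending


# Synthetic samples (hypothetical): a -1100 microstrain axial pulse plus
# 50 microstrain of bending that appears with opposite sign on each gage.
axial_true = [0.0, -1100e-6, -1100e-6, 0.0]
bend_true = [0.0, 50e-6, -50e-6, 0.0]
gage_a = [x + b for x, b in zip(axial_true, bend_true)]
gage_b = [x - b for x, b in zip(axial_true, bend_true)]

axial, bending = split_axial_bending(gage_a, gage_b)
print(max(abs(b) for b in bending))  # peak bending strain seen by the gages
```

Inspecting the half-difference alongside the average gives a direct measure of how much bending was
present in a given shot.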


Figure 14: Voltage traces for Hopkinson bar (plot title: Strain Gage Traces)


By placing the initial compressive and tension pulses over each other, another aspect of this experimental
method is revealed. With no specimen present, the two pulses should have the same shape. However, wave
dispersion has caused the pulses to change shape [11]. Wave dispersion describes the phenomenon that in
uniaxial stress pulses, waves of different frequencies travel at different velocities. As a Fourier analysis will
show, a square pulse is made up of many different waves of various frequencies and amplitudes. These different
waves start to diverge immediately, and thus the shape of the pulse changes as it travels. The compressive
pulse (plotted as tension) and the reflected tension pulse can be seen in Figure 15. The magnitude of the
strain pulses, 1100 με, verifies that the data follow the elastodynamic relation V = Eε/ρc derived in section
2.1.1.
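
The Fourier argument above can be made concrete with a small numerical sketch. It builds a square pulse
from its Fourier harmonics and lets each harmonic travel at its own phase velocity. Several things here are
assumptions and not taken from the report: the phase velocity law is Rayleigh's first-order approximation
for a round bar (with the wavelength approximated by C0/frequency), the bar properties and pulse duration
are hypothetical, and a periodic pulse train stands in for the single pulse.

```python
import math

C0 = 5000.0      # long-wavelength bar wave speed, m/s (assumed)
NU = 0.3         # Poisson's ratio (assumed)
RADIUS = 0.01    # bar radius, m (hypothetical)
PERIOD = 400e-6  # repetition period of the pulse train, s (hypothetical)
WIDTH = 100e-6   # pulse duration, s (hypothetical)
N_HARM = 40      # number of Fourier harmonics kept


def phase_velocity(freq):
    """Rayleigh's first-order correction, wavelength approximated by C0/freq."""
    return C0 * (1.0 - (NU * math.pi * RADIUS * freq / C0) ** 2)


def pulse(t, x):
    """Unit square pulse train after travelling a distance x along the bar."""
    w0 = 2.0 * math.pi / PERIOD
    total = WIDTH / PERIOD  # mean value of the pulse train
    for n in range(1, N_HARM + 1):
        wn = n * w0
        a_n = math.sin(wn * WIDTH) / (n * math.pi)
        b_n = (1.0 - math.cos(wn * WIDTH)) / (n * math.pi)
        # Each harmonic arrives with its own delay set by its phase velocity.
        delay = x / phase_velocity(wn / (2.0 * math.pi))
        total += a_n * math.cos(wn * (t - delay)) + b_n * math.sin(wn * (t - delay))
    return total


def shape_change(x, samples=400):
    """Largest difference between the travelled pulse (shifted by x/C0) and the original."""
    times = [k * PERIOD / samples for k in range(samples)]
    return max(abs(pulse(t + x / C0, x) - pulse(t, 0.0)) for t in times)


print(f"shape change after 2 m of travel: {shape_change(2.0):.3f} (pulse height 1)")
```

If every harmonic travelled at C0 the shape change would be zero at any distance; the nonzero value comes
entirely from the frequency dependence of the phase velocity.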

Due to the change in shape, noise is generated in the data calculated from the pulses. Figure 16
shows load, stress, strain rate, and displacement at the end of the bar during the passage of the initial pulse.
With no specimen present, the load and stress should be zero for all time. The strain rate is the average
strain rate that a specimen would see, as determined by the displacement shown in the final graph. The
magnitude of the displacement is as expected from the length of the pulse, the elastic wave speed, and the
particle velocity in the wave.
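
A minimal sketch of how such end-of-bar quantities follow from the two strain pulses is given below. It
uses the standard one-dimensional elastic wave relations, which are assumed here because the derivation of
section 2.1.1 is not reproduced in this passage. Strains are taken as tension-positive, the pulses are assumed
to be time-aligned at the bar end, and the bar properties are hypothetical.

```python
E_BAR = 200e9    # Young's modulus, Pa (assumed, steel-like)
C_BAR = 5000.0   # bar wave speed, m/s (assumed)
AREA = 5.0e-4    # bar cross-section, m^2 (hypothetical)


def end_of_bar(incident, reflected, dt):
    """Load, velocity, and displacement at the bar end from time-aligned strain pulses."""
    # Compressive load is reported as positive; it vanishes when the pulses mirror each other.
    load = [-E_BAR * AREA * (ei + er) for ei, er in zip(incident, reflected)]
    velocity = [-C_BAR * (ei - er) for ei, er in zip(incident, reflected)]
    displacement = [0.0]
    for k in range(1, len(velocity)):
        # Trapezoidal integration of the end velocity.
        displacement.append(displacement[-1] + 0.5 * (velocity[k] + velocity[k - 1]) * dt)
    return load, velocity, displacement


incident = [-1100e-6] * 5                                  # compressive pulse
mirrored = [1100e-6] * 5                                   # ideal free-end reflection
distorted = [1000e-6, 1200e-6, 1100e-6, 1050e-6, 1150e-6]  # hypothetical dispersed reflection

load_ideal, vel_ideal, disp_ideal = end_of_bar(incident, mirrored, dt=1e-6)
load_noisy, _, _ = end_of_bar(incident, distorted, dt=1e-6)
print(max(abs(f) for f in load_ideal), max(abs(f) for f in load_noisy))
```

With these assumed bar properties, a mismatch of 100 microstrain between the two pulses already appears
as a spurious load of about 10 kN, which illustrates how a modest change in pulse shape can swamp the
few-kN loads of interest.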

Figure 15: Comparison of initial and reflected strain pulses (plot title: Strain Pulses in Bar)

Figure 16: Data calculated from strain pulses

The first noticeable error is in the load graph. The load is obviously not zero due to the change in shape
of the pulses caused by dispersion. Similarly, even though the end of the bar must physically be a stress
free surface, the stress is reported to be non-zero. The troubling aspect of the noise is its magnitude.
The load repeatedly reaches values of ±20 kN. While such an uncertainty may be acceptable
when testing metals, it is well above the load expected to fail either of the inert simulants and is
unacceptable. The maximum loads reported so far for failure are about 6 kN.

Results of the Hopkinson bar testing are shown in Figure 17. Post mortem examination showed
no difference in failure mode from the tests conducted on the mechanical press. The impact velocities for
the Filler-E and PBX simulant are 11.04 and 12.35 m/s, respectively. The results do not reveal much about
the behavior of the simulants at these punch speeds. The error from dispersion overwhelms any data that
may be present in the voltage traces. These simulants are so much weaker than any steel used that the
strength of the simulants, or the explosives they model, may be neglected. Similar tests on 1018 steel give
yield strengths of 300 MPa. This technique is better used for metals, which have a much higher yield strength
and elastic modulus.

Figure 17: High velocity load-displacement curves for (a) Filler-E and (b) cure cast simulant


3.2  Impact  of  Tritonal  and  PBX  N109 

With the fracture characterized for lower velocities, the experiments at the AWEF reveal the failure behavior
at very high loading rates and its effect on the initiation of the materials. As can be seen in the pictures
taken with the Cordin camera (Figures 18 and 19), both materials fracture quickly and are ejected from
the insert. This ejected material is not reacting at any appreciable rate. Both materials behave similarly,
but the Tritonal seems to have more material ejected. Tritonal is a brittle material that fractures quite
easily. With the initial impact, the Tritonal forms fragments which are then ejected with further projectile
motion. The PBX-N109, on the other hand, is a weak but more ductile material. It offers little resistance
to the projectile motion, but is ejected in larger pieces from the insert.

None  of  the  test  shots  caused  detonation  of  the  specimens.  The  pressure  generated  in  the  shock  wave 
by  the  impact  is  1.23  GPa  for  the  Tritonal,  and  even  less  for  the  PBX-N109  as  determined  by  the  Hugoniot 
method  [12].  From  Pop  plots  for  TNT  [7],  these  pressures  will  not  cause  shock  detonation  for  a  specimen 
0.25”  thick.  Limits  of  the  gun  prevented  higher  impact  speeds. 

In  other  tests,  a  cover  plate  of  titanium  6%  Al-4%  V  alloy  was  used  to  determine  if  containment  and 
possible  heating  from  the  plate  would  cause  detonation.  Again,  there  was  no  reaction. 

To get a detonation, the reactive material must be contained and compressed [3]. In this experiment, it
is too easy for the fractured material to escape; there is negligible internal friction and negligible hydrostatic
pressure in the shear zone. High pressure combined with shear friction could easily lead to ignition. Boyle
et al. [3] performed tests that combined pressure and shear, showing that TNT would ignite for velocities
around 80 m/s with a pressure of 0.5 GPa. The method of initiation in Boyle's tests is probably internal
friction, as the tests with steel on explosive and explosive on explosive boundaries made no difference in
the outcome. The main difference between these experiments and Boyle's is that in Boyle's tests there is
an applied hydrostatic pressure; clearly the existence of a hydrostatic pressure is critical to the dominant
ignition mechanism in that case.

Figure 18: High speed photography of PBX N109 impact, approximate times after impact are: 1) 36 μs, 2)
116 μs, 3) 180 μs, and 4) 272 μs

Figure 19: High speed photography of Tritonal impact, approximate times after impact are: 1) 28 μs, 2)
112 μs, 3) 184 μs, and 4) 252 μs

During the experiments conducted at the AWEF, the technique of turning off external lights to capture
light given off by the reaction was not used. Since very little, if any, reaction occurred, it is expected that
no light was emitted.

Chou  [5]  has  run  numerical  models  on  similar  experiments,  one  of  which  included  an  impact  of  TNT 
by  a  steel  projectile  at  200  m/s.  In  his  calculations,  he  shows  that  a  shear  band  forms  at  the  corner  of 
the  projectile.  Temperatures  reach  500°F.  These  temperatures  are  not  high  enough  within  the  band  to 
cause  ignition.  At  impact  velocities  of  1000  m/s,  pressures  and  temperatures  become  high  enough  to  cause 
detonation  for  his  model.  What  Chou  does  not  take  into  account  is  the  fracture  of  the  material.  As  seen 
in the photographs, the Tritonal fractures very quickly. Even with cover plates, the material is no longer a
continuous solid but a granular material, and it should be modeled accordingly. Streak camera photographs
have  shown  that  detonation  waves  in  pressed  or  cast  explosives  are  rough,  indicating  that  the  flow  is  irregular 
in  its  fine  detail.  Initiation  thus  depends  strongly  on  the  type,  number,  and  distribution  of  inhomogeneities 
[4].  By  fracturing  these  materials,  the  initiation  behavior  may  change  dramatically. 

4  Conclusions 

The ultimate strengths of both Filler-E and the cure cast PBX simulant are far below that of any material
used for a bomb casing, usually 4340 steel. When modeling the mechanical strength of the weapon
as a whole, the explosive can be neglected, as its negligible relative strength means it will not affect the
behavior of the outer casing.

From the behavior of the Filler-E, it can be inferred that upon impact, Tritonal will fracture very
quickly. In a deep earth penetrator, the explosive may even become a granular material before ignition
occurs.  This  may  hamper  or  improve  ignition  of  the  explosive  depending  on  loading  conditions.  Penetration 
of  the  weapon  by  a  foreign  object  could  have  disastrous  effects.  The  fracturing  and  then  compression  of 
Tritonal  could  lead  to  ignition.  Internal  friction  is  greatly  increased  after  the  fracture  occurs  if  a  large 
hydrostatic  pressure  is  present.  All  the  free  surfaces  are  then  allowed  to  rub  against  each  other,  greatly 
increasing  the  risk  of  premature  initiation. 

The cure cast PBX, which does not fracture at lower loading rates, also fails into many small particles
upon impact. The cure cast simulant showed a large amount of strain rate hardening and a decrease in failure
strain between the quasi-static and low velocity tests, and this trend seems to continue to the very high
velocities for PBX N109. Though load carrying capacity at this loading rate is not known, the failure mode
is much closer to that of the Tritonal. Again, the effect on the initiation from any failure is unknown, but
internal friction with a large hydrostatic pressure will be the primary cause of initiation at these loading
rates.


Nomenclature 


r             radial direction
z             direction along punch axis
Ur            displacement in r direction
Uz            displacement in z direction
v             specific volume in Hugoniot analysis
w             shear width (clearance)
db            bar diameter
[illegible]   bar area
h             specimen thickness
c             sound velocity in punch
E             modulus of elasticity in punch, energy in Hugoniot analysis
ε             axial strain in punch
τ             shear stress in specimen
γ             shear strain in specimen
γ̇             shear strain rate in specimen
Vi, V         incident and particle velocity in Kolsky bar
Us            shock velocity
Up            particle velocity in projectile or target for impact problems
P             pressure
T             temperature
ρ             density
W             work